Extended nucleotide sequences.



A method of characterising a genomic DNA sample with reference to at least one informative genetic locus.
The method comprises amplifying the minisatellite sequence at at least one informative genetic locus by the use
of flanking primers and extension thereof. Novel primers, extension products thereof and diagnostic kits for use
in the above method are also provided. The method is of particular use for genetic characterisation purposes,
paternity and maternity testing as well as forensic testing.
EXTENDED NUCLEOTIDE SEQUENCES



The present invention relates generally to a method of characterising a genomic DNA sample and to
nucleotide sequences employed in the method. In particular the invention involves the use of nucleotide
sequences comprising oligonucleotides hybridisable to regions adjacent informative genetic loci. The
method of the invention may for example be used in paternity disputes, forensic medicine or in the
prevention, diagnosis and treatment of genetic disorders or predispositions. The method is of particular use
where only as little as one molecule of an informative locus is present in the genomic DNA sample.

Methods of genetic characterisation are known in the art. In UK patent no. 2166445 (Lister Institute)
there are described various DNA sequences which may be used as probes to hybridise simultaneously to a
number of polymorphic sites within animal, e.g. human, and plant genomes enabling the production of a
"DNA fingerprint" composed of marked DNA bands of differing molecular weights. The DNA fingerprint as a
whole is characteristic of the individual concerned and the origin of the differing bands can be traced
through the ancestry of the individual and can in certain cases be postulated as associated with certain
genetic disorders. In European Patent Application, Publication No. 238329, there are described various DNA
sequences which may be used as probes to hybridise individually at individual polymorphic sites within the
animal, for example human, genome. Methods of genetic characterisation using one or more of such probes
are described.

Tandem-repetitive minisatellite regions in vertebrate DNA frequently show high levels of allelic
variability in the number of repeat units [1-4]. Hybridization probes capable of detecting multiple minisatellites
and producing individual-specific DNA fingerprints have been developed [5-7], as well as cloned human
minisatellites which provide locus-specific probes for individual hypervariable loci [5, 8-10]. These highly
informative genetic markers have found widespread application in many areas of genetics, including linkage
analysis [9, 11-13], determination of kinship in for example paternity and immigration disputes [6, 10, 14,
15], monitoring bone marrow transplants [16, 17], and for individual identification in forensic medicine [10,
18, 19]. Applications to typing forensic samples such as blood and semen stains or hair roots are, however,
limited by the sensitivity of the hybridization probes, which require at least 50ng of relatively undegraded
human DNA for typing with locus-specific minisatellite probes [10] and 0.01- g DNA for analysis with
multilocus DNA fingerprint probes [6].

Where a sufficient amount of sample DNA is available in the test sample the above disclosures provide
valuable and reliable methods of genetic characterisation. However the efficiency of the above techniques is
reduced where the amount of genomic DNA in the test sample is limited, for example in forensic
applications where often only small copy numbers of the test DNA molecule are available.

It is therefore desirable to provide a further method of genetic characterisation which is especially 
suitable for small samples of genomic DNA. 

K. Kleppe et al in J. Mol. Biol. (1971), 56, 341-361 disclose a method for the amplification of a desired
DNA sequence. The method involves denaturation of a DNA duplex to form single strands. The denaturation
step is carried out in the presence of a sufficiently large excess of two nucleic acid primers which hybridise
to regions adjacent the desired DNA sequence. Upon cooling two structures are obtained each containing
the full length of the template strand appropriately complexed with primer. DNA polymerase and a sufficient
amount of each required nucleoside triphosphate are added whereby two molecules of the original duplex
are obtained. The above cycle of denaturation, primer addition and extension is repeated until the
appropriate number of copies of the desired DNA sequence is obtained. It is indicated that adjustment of
the primer concentration may be required.

The above method is now referred to as polymerase chain reaction (PCR) as claimed in United States
patents nos. 4683195 and 4683202 wherein amplification of a given nucleic acid sequence on a template is
effected by extension of a nucleic acid primer in the presence of Taq polymerase or the Klenow fragment
of E. coli DNA polymerase I. The amplification procedure is generally repeated for up to about 50 cycles.
The examples provided only relate to short DNA sequences, generally of a few hundred base pairs.

The enzymatic amplification of DNA by the polymerase chain reaction (PCR, [20]) enables such smaller
amounts of human DNA to be analysed. The remarkable specificity of thermostable Taq polymerase has
greatly simplified PCR [21] and has allowed typing of some classes of human DNA polymorphism to be
extended to single hair roots [22] and indeed to individual somatic cells and sperm [23]. In most work to
date, PCR has been used to amplify short regions of human DNA, usually a few hundred base pairs long
[21-23]. Base substitutional polymorphisms can be detected by hybridizing PCR products with allele-specific
oligonucleotides [22, 23], by DNA sequence analysis of PCR products [24], or, if the base
substitution affects a restriction site, by cleavage of PCR products with a restriction endonuclease [25].
Deletion/insertion polymorphisms can likewise be analysed by sizing PCR products by gel electrophoresis
[22]. Most of these marker systems are however dimorphic and their utility in for example forensic medicine
is limited by their relatively low variability in human populations.

As explained above PCR has been restricted to the amplification of relatively short DNA sequences and
there are serious doubts as to whether PCR may be employed successfully in relation to long DNA
sequences, especially if faithful reproduction of such sequences is necessary. Indeed 2kb has been stated
to represent the absolute limit of PCR [21]. Further difficulties arise however where the sample DNA
contains repetitive sequences, for example tandem repeats of a particular core or consensus sequence
such as those found at highly informative hypervariable loci.

Thus it can be predicted that hybridisation will take place between amplification products, resulting
in networking and the premature termination of the PCR reaction at an annealed site. Such incomplete PCR
products could then act, by out-of-register annealing to the complementary minisatellite strand, as a
substrate for extension in the next cycle of amplification. This will result in the generation of multiple
spurious amplification products. The application of PCR to the amplification of minisatellites, particularly
long minisatellite alleles containing many repeat units, is thus likely to involve the loss of sequence fidelity
and yield such a large number of amplification products that the results of such a process are neither
meaningful nor an accurate reflection of the minisatellite alleles in the starting genomic DNA.

Whilst E. Boerwinkle and L. Chan (1988) in an abstract given out at the 39th Annual Meeting of the
American Society of Human Genetics in New Orleans (October, 1988) [Abstract (0548) 12.5] refer to the use
of PCR on tandemly repeated hypervariable loci, the size of the targeted region which they employed was
relatively small, being always less than one kilobase in length. Moreover they do not disclose the way in
which they applied the PCR technique to achieve faithful reproduction of sequences and the production of a
meaningful and accurate result. S. Odelberg et al., in a disclosure to the American Academy of Forensic
Science (February, 1988), on the other hand have described their unsuccessful application of PCR to larger
minisatellites and their findings confirm the predicted difficulties discussed hereinbefore since they were
unable to use PCR to obtain any meaningful or useful result in view of the multiplicity of amplification
products detected.

The present invention is based on the discovery that as little as one molecule of an informative genetic
locus in a genomic DNA sample may be faithfully amplified many times to yield amplification products
which are useful for genetic characterisation purposes by effecting the amplification within a defined window
such that sufficient of the desired extension product is generated to be detectable but that the yield of
extension product is inadequate to permit substantial out-of-register hybridisation between complementary
tandem-repeated template strands.

According to one feature of the present invention there is provided a method of characterising a test
sample of genomic DNA by reference to one or more controls, which method comprises amplifying the
minisatellite sequence at at least one informative locus in the test sample by

(i) hybridising the test sample with primer in respect of each informative locus to be amplified, the
primer being hybridisable to a single strand of the test sample at a region which flanks the minisatellite
sequence of the informative locus to be amplified under conditions such that an extension product of the
primer is synthesised which is complementary to and spans the said minisatellite sequence of the strand of
the test sample;

(ii) separating the extension product so formed from the template on which it was synthesised to
yield single stranded molecules;

(iii) if required hybridising the primer of step (i) with single stranded molecules obtained according to
step (ii) under conditions such that a primer extension product is synthesised from the template of at least
one of the single stranded molecules obtained according to step (ii); and

(iv) detecting the amplification products and comparing them with one or more controls; the method
being effected such that sufficient of the desired extension product is generated to be detectable but such
that the yield of extension product is inadequate to permit substantial out-of-register hybridisation between
complementary minisatellite template strands.

It will be appreciated that the cycle of hybridising primer to at least one of the single stranded
molecules obtained by separation of the extension product formed from the template on which it was
synthesised and then further synthesis of extension product may be repeated as few or as many times as is
consistent with the need, on the one hand, to generate sufficient extension product to be detectable and on
the other hand to generate a yield of extension product inadequate to permit substantial out-of-register
hybridisation between complementary minisatellite template strands.

Where the method of the present invention is effected using one primer in respect of each informative
locus it may be advantageous to subject the test sample of genomic DNA to restriction prior to
amplification. In this regard where only one primer is used in respect of an informative locus, extension
products of varying length are likely to be formed. For example a series of products may be formed having
the same 5' terminus, but different 3' termini. A cleaner profile of product is obtainable if the test sample of
genomic DNA is subjected to restriction prior to amplification. Where restriction is effected any appropriate
restriction endonuclease may be employed. Since the sequence flanking the relevant informative locus will
be known and the recognition sequence of known restriction enzymes is also known, the skilled man will
encounter no difficulty in selecting an appropriate restriction enzyme for use. Where only one primer is
used in respect of each informative locus arithmetical amplification will be achieved, but where two such
primers are employed exponential amplification may be obtained. The method of the present invention is
thus preferably effected using two primers in respect of each informative locus to be amplified.
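
The distinction between arithmetical and exponential amplification may be illustrated by the following minimal sketch. It assumes an idealised reaction in which every available template is copied in every cycle; the function names are illustrative only and form no part of the method as described.

```python
def arithmetic_products(templates: int, cycles: int) -> int:
    """Extension products formed after `cycles` rounds using ONE primer.

    Only the original template strands can bind the primer, so each
    cycle adds a fixed number of new products (idealised, 100% yield).
    """
    return templates * cycles


def exponential_copies(duplexes: int, cycles: int) -> int:
    """Duplex copies present after `cycles` rounds using TWO primers.

    Each extension product serves as a template for the other primer,
    so the number of copies doubles every cycle (idealised, 100% yield).
    """
    return duplexes * 2 ** cycles


if __name__ == "__main__":
    for n in (5, 10, 15):
        print(n, arithmetic_products(1, n), exponential_copies(1, n))
```

Under these assumptions ten cycles starting from a single molecule give 10 extension products with one primer and 1024 copies with two primers. Real yields per cycle will be lower.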

Thus according to a further feature of the present invention there is provided a method of characterising
a test sample of genomic DNA by reference to one or more controls, which method comprises amplifying
the minisatellite sequence at at least one informative locus in the test sample by

(i) hybridising the test sample with two primers in respect of each informative locus to be amplified,
each primer being hybridisable to single strands of the test sample at a region which flanks the minisatellite
sequence of the informative locus to be amplified under conditions such that an extension product of each
primer is synthesised which is complementary to and spans the said minisatellite sequence of each strand
of the test sample whereby the extension product synthesised from one primer, when it is separated from
its complement, can serve as a template for synthesis of the extension product of the other primer;

(ii) separating the extension product so formed from the template on which it was synthesised to
yield single stranded molecules;

(iii) if required hybridising the primers of step (i) with the single stranded molecules obtained
according to step (ii) under conditions such that a primer extension product is synthesised from the
template of each of the single stranded molecules obtained according to step (ii); and

(iv) detecting the amplification products and comparing them with one or more controls; the method
being effected such that sufficient of the desired extension product is generated to be detectable but such
that the yield of extension product is inadequate to permit substantial out-of-register hybridisation between
complementary minisatellite template strands.

As stated above it will be appreciated that the cycle of hybridising primer to each of the single stranded
molecules obtained by separation of the extension product formed from the template on which it was
synthesised and then further synthesis of extension product, may be repeated as few or as many times as
is consistent with the requirement on the one hand to generate sufficient extension product to be detectable
and on the other hand to generate a yield of extension product inadequate to permit substantial
out-of-register hybridisation between complementary minisatellite template strands.

An informative locus includes regions of genomic DNA by reference to which individuals can be
distinguished. Hypervariable loci, of which there may be over 1000 in the human genome, may be useful as
informative loci. The distinguishing power of an informative genetic locus is often referred to in terms of
allelic variation or polymorphism. Generally, the greater the degree of allelic variation or polymorphism
between individuals the greater the distinguishing power of the locus in question. As a convenient guide
informative genetic loci include those in which at least 3 different alleles can be distinguished in any sample
of 100 randomly selected unrelated individuals. It will be appreciated that the term individual has been used
above to refer not only to humans, but also to other animals as well as to plants and to cell lines derived
from such humans, animals and plants. In each case, however, the sample of randomly selected unrelated
individuals will all be from the same species.
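
The convenient guide given above may be sketched as a simple count of distinguishable alleles in one sample. The sketch assumes that each individual is represented by a pair of allele labels (for example allele sizes in kilobases) which have already been resolved as distinguishable; the function names and the example values are hypothetical.

```python
from typing import Hashable, Iterable, Tuple

Genotype = Tuple[Hashable, Hashable]


def distinct_alleles(genotypes: Iterable[Genotype]) -> int:
    """Count the different allele labels seen across all individuals."""
    seen = set()
    for pair in genotypes:
        seen.update(pair)
    return len(seen)


def is_informative(genotypes: Iterable[Genotype], min_alleles: int = 3) -> bool:
    """Apply the guide: at least `min_alleles` different alleles in the sample."""
    return distinct_alleles(genotypes) >= min_alleles


if __name__ == "__main__":
    # Hypothetical allele sizes (kb) for three unrelated individuals.
    sample = [(1.2, 1.2), (1.2, 2.0), (2.0, 3.5)]
    print(distinct_alleles(sample), is_informative(sample))
```

The guide refers to any sample of 100 individuals, so a single sample passing this check is an indication only.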

In respect of the human applications of the present invention the expression "informative genetic locus"
as used herein may alternatively be defined as one at which at least 3 different alleles can be distinguished
in DNA extracted from any 20 cell lines selected from the following, which cell lines have been deposited
with the American Type Culture Collection (ATCC):-

Cell line            ATCC Deposit No.

Hela                 CCL 2
RPMI 2650            CCL 30
Detroit 532          CCL 54
Detroit 525          CCL 65
Detroit 529          CCL 66
Detroit 510          CCL 72
WI-38                CCL 75
Citrullinemia        CCL 76
EB-3                 CCL 85
RAJI                 CCL 86
JIYOYE (P-2003)      CCL 87
WI-26                CCL 95
Detroit 551          CCL 110
RPMI 6666            CCL 113
RPMI 7666            CCL 114
CCRF-CEM             CCL 119
CCRF-SB              CCL 120
HT-1080              CCL 121
HG 261               CCL 122
CHP3 (M.W.)          CCL 132
LL47 (MaDo)          CCL 135
HEL 299              CCL 137
LL 24                CCL 151
HFL1                 CCL 153
WI-1003              CCL 154
MRC-5                CCL 171
IMR-90               CCL 186
LS 174T              CCL 188
LL 86 (LeSa)         CCL 190
LL 97A (AlMy)        CCL 191
HLF-a                CCL 199
CCD-13Lu             CCL 200
CCD-8Lu              CCL 201
CCD-11Lu             CCL 202
CCD-14Br             CCL 203
CCD-16Lu             CCL 204
CCD-18Lu             CCL 205
CCD-19Lu             CCL 210
Hs888Lu              CCL 211
MRC-9                CCL 212
Daudi                CCL 213
CCD-25Lu             CCL 215
SW403                CCL 230
NAMALWA              CRL 1432

Maryland 20852-1776, USA and are listed in the ATCC catalogue of Cell Lines and Hybridomas. All the
above-mentioned cell lines were on deposit with the ATCC prior to 1985.

Convenient informative loci for use in the present invention include loci to which nucleotide sequences
and probes disclosed in European patent application no. 238329 are capable of hybridisation as well as
those informative loci the flanking sequences of which are disclosed in this application. Other informative
loci [...] minisatellite probes disclosed by S. J. Gendler et al in
PNAS 84, (1987), pages 6060-6064. A particularly informative locus is the 5' alpha globin HVR disclosed
by the American Journal of Human Genetics, 1988, 43, pages 249-256.

Convenient informative loci for use in the present invention include those of up to 15 kilobases. More
[...] short alleles, the upper limit being for example of
[...], particularly of less than 8, more particularly less than 6 kilobases. Conveniently the
[...] 1.5, more particularly at least 2 kilobases. Thus suitable ranges
include 1, 1.5, 2, 2.5 or 3, to 6, 7, 8, 9 or 10 kilobases. In respect of the performance of the method of the
invention starting with only a single sample DNA molecule it is preferred that the length of the alleles does
not exceed 6 kilobases. Also preferred are informative loci with a restricted range of allele length. It has
been observed that these preferences are not always compatible with highly informative loci since these are
usually associated with large numbers of minisatellite repeat units and long alleles. The skilled man will
however have no difficulty in selecting convenient loci for his purpose.

Primers capable of hybridisation to regions of sample DNA flanking informative loci are required to
provide points for the initiation of synthesis of extension products across informative genetic loci. Such
primers are generally oligonucleotides and are preferably single stranded for maximum efficiency in
amplification, but may alternatively be double stranded. If double stranded, the primer is first treated to
separate its strands before being used to prepare extension polynucleotides. Primers must be sufficiently
long to prime the synthesis of extension polynucleotides. The exact length of such primers will depend on
many factors including temperature and the nature of the primer. Generally a primer will comprise at least
seven nucleotides, such as 15-25 nucleotides, for example 20-25 nucleotides. It will be appreciated that the
flanking sequence need not reflect the exact sequence of the flanking region of the informative locus. It is
merely necessary that the respective sequences are homologous to a degree enabling hybridisation. The
extension product so obtained must also be capable of acting as a template for further hybridisation with a
primer. The maximum length of any primer is not believed to be critical and is only limited by practical
considerations.

Sample DNA is generally double stranded and it is therefore convenient to select primers
which hybridise to different strands of the sample DNA and at relative positions along the sequence such
that an extension product synthesised from one primer can act as a template for extension of the other
primer into a nucleic acid of defined length.
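
The selection of a primer pair hybridising to different strands may be sketched as follows. The sketch assumes that the two flanking sequences are known and written 5' to 3' on the same strand, and it simply takes the outermost bases of each flank; the function names and the toy sequences are hypothetical and are not sequences of the invention.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def flanking_primers(left_flank: str, right_flank: str, length: int = 20):
    """Pick one primer per strand so that extension spans the locus.

    left_flank  : sequence 5' of the minisatellite, written 5'->3'
    right_flank : sequence 3' of the minisatellite, same strand, 5'->3'
    Returns (forward, reverse), both written 5'->3'.
    """
    if length < 7:
        raise ValueError("a primer generally comprises at least seven nucleotides")
    if len(left_flank) < length or len(right_flank) < length:
        raise ValueError("flanking sequence shorter than requested primer")
    forward = left_flank.upper()[:length]                # same sense as the flank
    reverse = reverse_complement(right_flank[-length:])  # anneals to the written strand
    return forward, reverse


if __name__ == "__main__":
    print(flanking_primers("ACGTACGTAA", "TTGGCCAATT", length=8))
```

Because the two primers lie on opposite strands and point towards each other, the product of one ends at the position of the other, which gives the defined length referred to above.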

The flanking regions of certain of the informative genetic regions identified by the probes disclosed in
UK Patent No. 2166445 and European patent application no. 238329 referred to above and nucleotide
sequences hybridisable thereto have not been previously disclosed and both individually and in any
combination represent further aspects of the present invention. The sequences are set out on pages
preceding the Examples. The informative genetic regions are indicated on some sequences by the word
MINISATELLITE. As explained above any convenient sequence may be used as a primer. Examples of
some convenient primer sequences have been underlined. For clarity and brevity, on some sequences only
the outermost repeat unit on each side of each minisatellite tandem array is shown (in capitals), separated
by a series of x's to indicate the omission of most repeats. Since there is seldom a whole number of repeat
units in each array, these outermost repeats may be out of register with each other, but preserve the
correct relation to the immediate flanking DNA.

A previously unsuspected tandem repeat region was found by sequencing pMS31 (EP-238329). This
array is separated from the major minisatellite by 10 bases, and consists of more than 7 repeats of a 19bp
G/C rich sequence which extends to the Sau3AI site defining the end of the clone. This region, designated
31B, is G-rich on the opposite strand to the G-rich strand of the major minisatellite. Genomic mapping
shows that, if variable at all, 31B makes a minimal contribution to the length variation at this locus (data not
shown).

Two loci were detected by pMS228 (Armour et al, 1989, NAR, 17, 13, 4925-4935) at high stringency.
Family analyses showed these loci to be tightly linked. Restriction mapping of the cloned DNA,
subsequently confirmed by sequence analysis, demonstrated the presence of two distinct tandem repetitive
regions, designated 228A and 228B (Figure 8). Reprobing human DNA with subclones from pMS228
showed that 228A detected the larger, more intensely hybridising locus, and 228B the smaller, more faintly
hybridising alleles. These minisatellites were localised, using somatic cell hybrids and in situ hybridisation, to
17p13-pter. The properties of the loci detected by pMS51 and pMS228, and by other minisatellites are
summarised in Table 2.

Previous evidence for the disposition of a pair of minisatellites in pMS43 as described in our UK patent
application 8813781.5 was confirmed by sequence data, and two further examples (pMS31 and pMS228) of
minisatellite clones containing two closely adjacent tandem repeat arrays were elucidated. Unlike pMS43, in
which the 'minor' locus 43B has a low (30%) heterozygosity, both the minisatellites in pMS228 have
heterozygosities greater than 80%. The minisatellite 228B (Figure 8, Table 2) combines a high heterozygosity
(85%) with a limited allele size range (0.6-5.5kb); this combination makes it a very useful locus for analysis
of minisatellites by the polymerase chain reaction [23]. In general, the most variable loci have a wide range
of allele sizes such that many alleles will exceed the maximum size currently amplifiable by this method, so
giving an incomplete profile. At the locus detected by 228B, in contrast, a survey of 48 unrelated people
showed that 95% of alleles were smaller than 2kb, and that the largest (5.5kb) was well within the
amplifiable range (J Armour and A Jeffreys, unpublished). We have shown that even the largest allele at this
locus can be amplified, providing the prospect of a complete and yet usefully informative profile from
amplification at a single locus.

A novel cloned minisatellite termed MS29 has been isolated which unusually detects two variable loci in
human DNA. One locus, located in the terminal region of the short arm of human chromosome 6, is also
present in great apes. The second minisatellite locus is located interstitially on chromosome 16p11, and is
absent both from non-human primates and from some humans. MS29 was isolated from a L47 genomic
library made from DNA enriched for minisatellite sequences as described elsewhere (Wong et al,
Ann.Hum.Genet. (1987) 51, 260-288). MS29 comprises a 39bp repeat sequence. This consists of 13
diverged repeats of the trinucleotide YAG wherein Y represents any one of A, G, C or T. The flanking
sequences of this minisatellite can be elucidated using sequencing techniques known per se.
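
The arithmetic of the MS29 repeat (13 trinucleotides of 3 bases giving 39bp) may be checked with the following minimal sketch, which counts how many in-frame trinucleotides of a repeat unit have the form YAG, Y being taken as any of A, G, C or T as stated above. The function name and the example sequences are hypothetical and are not the MS29 sequence.

```python
def count_yag_units(repeat: str) -> int:
    """Count in-frame trinucleotides of the form YAG (Y = A, G, C or T).

    The repeat unit is read in frame from its first base; a diverged
    trinucleotide (one not ending in AG) is simply not counted.
    """
    repeat = repeat.upper()
    if len(repeat) % 3 != 0:
        raise ValueError("repeat length is not a multiple of three")
    units = [repeat[i:i + 3] for i in range(0, len(repeat), 3)]
    return sum(1 for u in units if u[0] in "ACGT" and u[1:] == "AG")


if __name__ == "__main__":
    perfect = "CAG" * 13            # hypothetical undiverged 39bp unit
    diverged = "CAG" * 12 + "CAA"   # hypothetical unit with one diverged repeat
    print(len(perfect), count_yag_units(perfect), count_yag_units(diverged))
```

A fully undiverged unit scores 13 of 13; since the repeats of MS29 are described as diverged, a lower score would be expected for the actual sequence.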

In summary the relevant novel flanking sequences are those of MS1, MS29, MS31A and MS31B, MS32,
MS43A and MS43B, MS51, MS228A and MS228B. In addition we now also provide further novel flanking
sequence information for the minisatellite probe pλg3 (Wong et al, Nucleic Acids Research 14, 11, (1986),
4605-4616), as well as for the probes 33.1, 33.4 and 33.6 (UK Patent 2166445, The Lister Institute of
Preventive Medicine).

A particular group of flanking sequences are those of the minisatellite probes MS1, MS29, MS31A and
MS31B, MS32, MS43A and MS43B, MS51, MS228A and MS228B.

Further particular groups of flanking sequences are comprised by those of any of the above minisatellite 
probes. 

As stated previously the minisatellites MS29 and MS31B are novel and the repeat sequences and/or
flanking sequences of either of these minisatellites represent further particular aspects of the invention.

The method of the present invention can be used in respect of any convenient informative locus. Thus,
for example, primers may be prepared for hybridisation to the flanking sequences of for example the
hypervariable region 5' to the human insulin gene (Am.J.Hum.Genet. 1986, 39, 291-229), of the 5' alpha
globin HVR (Am.J.Hum.Genet. 1988, 43, 249-256), of the alpha globin 3' HVR (EMBO J. 1986, 5,
1957-1863), of the hypervariable region at the zeta globin locus (PNAS, 1983, 80, 5022-5026), or of the Ha-Ras
locus (Nature, 1983, 302, 33-37). Also incorporated by reference in this application are a panel of probes
proposed by Ray White. The flanking sequences of these may be elucidated using known techniques.

The extended flanking polynucleotides prepared according to the method of the invention represent a
further aspect of the invention. These polynucleotides may be present in single or multiple copies. Where
multiple copies are present these are faithful copies and preferably substantially free from networking or
cross linkage between individual strands. By the term faithful we mean that the genetic characterisation
information to be obtained from the size and composition of the copy is essentially the same for all copies.
Conveniently the extended primers comprise a sequence identical with or complementary to an informative
genetic locus of more than 1 kilobase. Other convenient values include 1.5 and 2 kilobases. Conveniently
the number of copies represents the products of at least 3, 5, 7, 9, 13 or 15 cycles of the amplification
method of the present invention.

According to a still further aspect of the invention there is provided a mixture containing multiple faithful
copies of extended primers (as hereinbefore defined). As above the mixture is substantially free from
networking or cross-linkage between individual strands.

The primers for use in the present invention may be prepared by methods analogous to those known in
the art. For example where a given flanking sequence is known, a convenient nucleotide primer may be
prepared by direct synthesis. Cloning techniques may also be used to reproduce DNA fragments containing
sequences flanking informative genetic loci.

Alternatively DNA fragments comprising informative genetic loci may be identified and using a
procedure analogous to that of the methods outlined above, a nucleotide sequence hybridisable to the
informative genetic locus may be extended. The product so obtained may then be directly modified, for
example by cleavage, to prepare a convenient primer or the sequence thereof may be determined and
convenient primers then prepared by direct synthesis.

As explained above the sample DNA is generally double stranded. It is therefore desirable to separate
the strands of the nucleic acid before it can act as a template for extension of a primer. Strand separation
can be accomplished by any suitable method including physical, chemical or enzymatic means.
Conveniently heat denaturation is used to provide up to about 99% denaturation. Typical temperatures used
include 85-105°C for times ranging from about 1-10 minutes.

A primer is preferably used for each unique strand of the sample DNA. In respect of double stranded
DNA two primers are therefore normally used. Generally the primers are chosen to allow extension to
proceed across the informative locus in the same direction along both strands, that is to say 5' to 3' or vice
versa. Preferably extension proceeds 5' to 3'.

Primers are conveniently hybridised with the genomic DNA sample under known conditions. Generally
this is allowed to take place in a buffered aqueous solution, preferably at a pH of 7-9. Preferably a large
excess of primer is present in the reaction mixture. It will be appreciated that in certain diagnostic
applications the amount of sample DNA may not be known but an excess of primers is desirable.

The amplification is conveniently effected by adding the deoxyribonucleoside triphosphates dATP,
dCTP, dGTP and dTTP to the synthesis mixture in adequate amounts and the resulting solution may then,
for example, be heated to about 90°-100°C for a period of for example from about 1 to 10
minutes. After this heating period the solution may be allowed to cool to 30-70°C which is preferable for the
primer hybridisation. Where Taq polymerase is used as the inducing agent to facilitate the extension
reaction it is preferable to effect primer hybridisation at a temperature of from 50-70°C, for example
50-65°C, such as 60°C.

It has been found that primer hybridisation is advantageously effected in a buffer of reduced ionic
strength and at an elevated annealing temperature. This reduces the possibility of mispriming. The
expressions "reduced ionic strength" and "elevated annealing temperature" are to be understood as
referring to conditions at the extremes of, or outside, those ranges which would generally be considered
appropriate by the molecular biologist of ordinary skill for a given nucleic acid amplification reaction. By
way of illustration, but not limitation, a convenient buffer will be of sufficient ionic strength to enable a stable
pH of 7-9 to be maintained. Convenient annealing temperatures are generally 50-60°C. It will be appreciated
that the actual reaction conditions will depend on the primer(s) used.

An appropriate agent for inducing or catalysing the extension reaction may then be added to the
annealed mixture and the reaction may then be allowed to proceed under conditions known in the art.

The inducing agent is conveniently an enzyme. Suitable enzymes include the Klenow fragment of E.
coli DNA polymerase I, T4 DNA polymerase, but particularly Taq polymerase. Other available DNA
polymerases, reverse transcriptase and other enzymes, including heat stable enzymes which will facilitate
the extension reaction, may also be used.

The newly synthesised strand and its complementary nucleic acid strand form a double-stranded
molecule which is used in the succeeding steps of the method. The steps of strand separation and
sequence extension may then be repeated as required using the procedures outlined above. Suitable
conditions for such procedures are outlined in US patent nos. 4683202 and 4683195 incorporated by
reference above. As explained in the above patent specifications the separation and extension cycles may
be performed stepwise or advantageously a plurality of cycles are operated, for example in a
semi-automated or fully automated manner.

In the conventional PCR technique the amplification reaction is generally allowed to proceed for up to
about 50 cycles. In contrast, according to the method of the present invention the upper limit for
amplification is believed to be about 25 cycles when starting from a single copy of the sample DNA. Where
larger starting amounts of genomic DNA sample are available fewer cycles of amplification are required.
The convenient window of amplification cycles depending on the amount of sample DNA and the nature of
the sequence will now be illustrated, but in no way limited, by reference to the following analysis:

The following analysis refers to amplification reactions using 10 µl volumes and extension times of
15 minutes. Consider an individual heterozygous for alleles A and B, where A is longer than B; it can be
predicted that A will therefore amplify less efficiently than B. The lower limit of cycle number C_l is dictated
by the sensitivity of the probes, which can detect about 0.1 pg product. To detect allele A, sufficient cycles
are needed to generate >0.1 pg allele A product. Similarly, >0.1 pg allele B must be produced to detect
both alleles. The upper limit of cycle number C_u is limited by the yield of both alleles A and B.

C_l and C_u can be estimated as follows:

Let M = initial mass of human genomic DNA (in picograms)
    a = length of allele A (kb)
    b = length of allele B (kb)
    g_a = gain per PCR cycle of allele A
    g_b = gain per PCR cycle of allele B

Since the diploid genome size of man is 6 x 10^6 kb, the initial amount of allele A is

$$\frac{M a}{6 \times 10^{6}} \ \text{pg}$$

Similarly, the initial amount of allele B is

$$\frac{M b}{6 \times 10^{6}} \ \text{pg}$$

The yield of allele A after C cycles is given by:

$$\frac{M a}{6 \times 10^{6}} \, g_a^{C} \ \text{pg}$$

and similarly for allele B.

From Fig. 3 the gain per cycle g is related to allele length L by

$$g_L = 2 - 0.093\,L \qquad (L = 0\text{-}6 \ \text{kb})$$

so that g_a = 2 - 0.093a and g_b = 2 - 0.093b. The lower limit of cycles C_l is therefore set by the lowest
value of C where

$$\frac{M a}{6 \times 10^{6}} (2 - 0.093\,a)^{C} > 0.1 \quad \text{and} \quad \frac{M b}{6 \times 10^{6}} (2 - 0.093\,b)^{C} > 0.1$$

The upper limit C_u is likewise set by the highest value of C where the total yield of both alleles is
<1000 pg, i.e.

$$\frac{M a}{6 \times 10^{6}} (2 - 0.093\,a)^{C} + \frac{M b}{6 \times 10^{6}} (2 - 0.093\,b)^{C} < 1000$$

C_l and C_u can be determined by computer reiteration. Some typical examples are:

M (pg)     a (kb)   b (kb)   C_l    C_u
10,000     5        1        7      20
1,000      3        2        10     24
100        6        0.5      19+    27
6*         6        1        27     32*

+ Note that the window becomes very narrow and can disappear
for very small amounts of DNA with widely differing allele sizes.
* N.B. This model does not take into account the fact that
significant stochastic variation in the number of target molecules
exists in amounts of human genomic DNA <100pg (equivalent to
17 cells). The analysis for 6 pg DNA corresponds to the
amplification of a single target molecule of each allele.
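
The iteration described above can be sketched in a few lines of Python. This is only an illustrative sketch under the assumptions stated in the text (diploid genome of 6 x 10^6 kb, linear gain g = 2 - 0.093 L, 0.1 pg detection limit, 1000 pg ceiling); the function names and the `max_cycles` search bound are hypothetical and are not taken from the specification.

```python
GENOME_KB = 6e6  # diploid human genome size in kb, as stated in the text


def gain(length_kb):
    """Gain per cycle from the linear fit g = 2 - 0.093 L (given for L = 0-6 kb)."""
    return 2.0 - 0.093 * length_kb


def allele_yield_pg(mass_pg, length_kb, cycles):
    """Yield in pg of one allele after `cycles` cycles, from mass_pg of genomic DNA."""
    return mass_pg * length_kb / GENOME_KB * gain(length_kb) ** cycles


def cycle_window(mass_pg, a_kb, b_kb, detect_pg=0.1, max_total_pg=1000.0, max_cycles=60):
    """Return (C_l, C_u); an entry is None if no cycle number satisfies its condition."""
    lower = None
    upper = None
    for c in range(max_cycles + 1):
        ya = allele_yield_pg(mass_pg, a_kb, c)
        yb = allele_yield_pg(mass_pg, b_kb, c)
        # C_l: first cycle at which both alleles exceed the detection limit
        if lower is None and ya > detect_pg and yb > detect_pg:
            lower = c
        # C_u: last cycle at which the combined yield is still below the ceiling
        if ya + yb < max_total_pg:
            upper = c
    return lower, upper


if __name__ == "__main__":
    for row in [(10000, 5, 1), (1000, 3, 2), (100, 6, 0.5), (6, 6, 1)]:
        print(row, cycle_window(*row))
```

Run on the four rows of the table, the sketch returns the tabulated C_l and C_u values, which suggests that the reconstructed formulae are the ones used to generate the table.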



Homozygotes can be readily accommodated in the model. C_u remains unaltered, whereas C_l is defined
by the number of cycles required to give >0.05 pg of each allele (i.e. >0.1 pg combined).
Reactions with different volumes can also be accommodated. C_l is defined by the number of cycles
required to give >0.1 pg of each allele in the total reaction. C_u is defined by the number of cycles required
to give <1000 pg total product per 10 µl of reaction. For large volume PCR reactions, C_l will be unchanged
but C_u will be somewhat larger than with a 10 µl reaction.
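
The homozygote and reaction-volume adjustments can be sketched in the same way. The following is a hedged illustration only: it assumes the linear gain g = 2 - 0.093 L and a diploid genome of 6 x 10^6 kb from the analysis above, treats a homozygote as two identical copies of one allele, and scales the 1000 pg ceiling in proportion to reaction volume. The function name and parameters are hypothetical.

```python
GENOME_KB = 6e6  # diploid human genome size in kb (assumption taken from the analysis above)


def gain(length_kb):
    """Gain per cycle from the linear fit g = 2 - 0.093 L."""
    return 2.0 - 0.093 * length_kb


def homozygote_window(mass_pg, allele_kb, volume_ul=10.0, max_cycles=60):
    """Return (C_l, C_u) for a homozygote amplified in a reaction of volume_ul microlitres."""
    per_copy_pg = mass_pg * allele_kb / GENOME_KB  # starting mass of each of the two copies
    ceiling_pg = 1000.0 * volume_ul / 10.0         # <1000 pg total product per 10 ul
    g = gain(allele_kb)
    lower = None
    upper = None
    for c in range(max_cycles + 1):
        each = per_copy_pg * g ** c
        # C_l: >0.05 pg from each copy, i.e. >0.1 pg combined
        if lower is None and each > 0.05:
            lower = c
        # C_u: combined product still below the volume-scaled ceiling
        if 2 * each < ceiling_pg:
            upper = c
    return lower, upper


if __name__ == "__main__":
    print(homozygote_window(6, 1))                  # single cell, 1 kb allele, 10 ul
    print(homozygote_window(6, 1, volume_ul=100))   # same sample in a 100 ul reaction
```

Under these assumptions the lower limit is the same in both reactions while the upper limit rises by a few cycles in the larger volume, in line with the statement that C_l is unchanged and C_u somewhat larger.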

The window of amplification cycles which generates an appropriate amount of amplification product
for characterisation (0.1-4000 pg of product) is for example 10-15 cycles for 100 ng of genomic DNA, 18
cycles for 1 ng and 25 cycles for single cell amplification (6 pg). The number of PCR cycles may need to be
increased to detect larger alleles which amplify less efficiently. Depending on allele length 1000-10^ pg of
faithfully amplified product can be obtained.

A particular advantage of the claimed method is that differences of as little as one repeat unit in an
informative region may be identified. The method of the present invention is also believed to offer faithful
reproduction of large informative genetic loci, for example those of up to 15 kb, such as up to about 10 kb.

It has also been found that the problems associated with the networking of amplified copies of sample
DNA as well as the generation of incomplete extension products can lead to the appearance of significant
amounts of single stranded minisatellite DNA of heterogeneous size. This single stranded DNA can produce
significant levels of aspecific products in the PCR reaction. This problem may be overcome or at least
alleviated by the use of enzymes which specifically digest or degrade single stranded DNA and which leave
double stranded DNA intact. The preferred enzyme is S1 nuclease. Digestion of PCR products with
such an enzyme results in a cleaner profile of PCR products revealed at the final detection step. Therefore
in a preferred aspect of the present invention the method of the invention includes the use of at least one
enzyme which specifically digests or degrades single stranded DNA whilst leaving double stranded DNA
intact, whereby to ameliorate the problems of aspecific products of the PCR reaction.

Detection of the amplified products may be effected by any convenient means. The amplified products
may be identified and characterised, for example using gel electrophoresis, followed if desired by
hybridisation with probes hybridisable to the informative loci contained therein, followed by for example
autoradiography where radiolabelled probes are used. Convenient procedures are disclosed in European
Patent Application Publication No. 238329. Alternatively more direct methods may be used. Separation of
the amplification products on a gel may be followed by direct visualisation thereof. Direct staining may
for example be used where a sufficient quantity of product is available. The flanking
polynucleotide sequences may alternatively carry a label or marker component and such label or marker may
then be detected using any convenient method. Such labels or markers may include either radioactive or
non-radioactive components but the latter are preferred.

The number of informative regions which can be amplified simultaneously is not believed to be limited,
other than by practical limitations. For example up to twenty regions could be amplified simultaneously,
conveniently ten regions and in particular up to eight, seven, six, five or four regions. In a preferred aspect

" "'^ ??lTanrngTolJ-^^^^^^^ ^ T'^l^ 






EP 0 370 719 A2 






a control sample of DNA and instructions for their use. Examples of convenient and preferred probes
include those outlined elsewhere in the specification. The kit may also include reagent nucleotides for
extension of the flanking polynucleotides across the informative genetic locus and/or an extension promoter
such as an enzyme.

Examples of flanking sequences to informative genetic regions which may be used in the method of the
present invention are set out on the following pages. Also disclosed are details of a panel of single locus
probes which are capable of hybridisation to informative genetic regions. This panel of probes was
disclosed by Ray White at a meeting in Quantico, Virginia in May 1988. Polynucleotide primers capable of
hybridisation to regions adjacent to such informative genetic regions may be prepared as previously
outlined and used in the method of the invention. Primers may be prepared in respect of any one such
region or combination of regions to provide extension products of any desired panel of informative
region(s).
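
Because the flanking sequences that follow are printed as paired strands (upper strand 5' to 3', lower strand 3' to 5'), a candidate primer for the opposite strand can be read off by taking a reverse complement. The sketch below illustrates this with the first 20 bases of the pMS1.1b 5' flanking sequence as printed; the choice of this 20-base stretch is purely hypothetical and is not one of the primers of the specification.

```python
# Translation table for base pairing; N (unknown base) pairs with N.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def complement(seq):
    """Complementary strand read in the same left-to-right order (i.e. 3' to 5')."""
    return seq.upper().translate(COMPLEMENT)


def reverse_complement(seq):
    """Complementary strand written 5' to 3', as a primer would be specified."""
    return complement(seq)[::-1]


if __name__ == "__main__":
    upper = "GCCATTTCCATAAACACGTA"  # first 20 bases of the printed upper strand (hypothetical choice)
    print("lower strand 3'->5':", complement(upper))
    print("lower strand 5'->3':", reverse_complement(upper))
```

The same helper gives a quick consistency check on the printed listings, since each lower line should equal the complement of the line above it.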

pMS1.1b 5' flanking sequence extending into minisatellite.

5' GCCATTTCCATAAACACGTATCGAGTACCTAATACTACAAAGTACCATACCAGGTACTAC
3' CGGTAAAGGTATTTGTGCATAGCTCATGGATTATGATGTTTCATGGTATGGTCCATGATG

   ATCCATTAAGAATGTTAATAATTAATCACGCTCATTTGCCATTGATTTTAAGTTTCCATT
   TAGGTAATTCTTACAATTATTAATTAGTGCGAGTAAACGGTAACTAAAATTCAAAGGTAA

   TTATTAGTGTCAGGTGGTTTTACCAATGGCCTCCCTTTCCTTCATGCTTACACTAAAAAA
   AATAATCACAGTCCACCAAAATGGTTACCGGAGGGAAAGGAAGTACGAATGTGATTTTTT

   AGAACAACAGTAATGATTGTATGTCATGCTTTTCTGTGATGAGCCTTGATGTTTTTAACA
   TCTTGTTGTCATTACTAACATACAGTACGAAAAGACACTACTCGGAACTACAAAAATTGT

   GAATTTGATAGTTTGAGACATAAAAATTTTTGAAAAACCAACTCACGTTTCCAAACAAAA
   CTTAAACTATCAAACTCTGTATTTTTAAAAACTTTTTCGTTGAGTGCAAAGGTTTGTTTT

   GTGAAAAGGAGAGCTGGATTCGAGCCCCGCCACCCCAGCAAATTGAGAAATCACCCCTTG
   CACTTTTCCTCTCGACCTAAGGTCGGGGCGGTGGGGTCGTTTAACTCTTTAGTGGGGAAC

   CTGTGAGTCAGTGGGT 3'
                        MINISATELLITE
   GACACTCAGTCACCCA 5'

pMS1.1b 3' flanking sequence, starting at the EcoRV site about 50bp
from the end of minisatellite

     EcoRV
     |
5' TGGATATCTCATGCATGGGAGCCCAGATTGGGTGGTCCTGGCGCTCATGGGTTGCATATG
3' ACCTATAGAGTACGTACCCTCGGGTCTAACCCACCAGGACCGGGAGTACCCAACGTATAC

   CTTCTTTGCTGCCTGTGCTGTAAGAACCATGCTAAGAACAGACCTGTGCCCCTGGGCTCT
   GAAGAAACTACGGACACGACATTCTTGGTACGATTCTTGTCTGGACACGGGGACCCGAGA

   CAGCAGTGCATGTGCCAGGGAGCTGTGGGTGTGGAAGCAGCAGGAAAAACAGCTGTGCCA
   GTCGTCACGTACACGGTCCCTCGACACCCACACCTTCGTCGTCCTTTTTGTCGACACGGT

   GGTCGGGGAGGGGAAGGGACAGAGAACCAGGGGCAGGGAAAGAGTTCCAAGCTCCTTGGC
   CCAGCCCCTCCCCTTCCCTGTCTCTTGGTCCCCGTCCCTTTCTCAAGGTTCGAGGAACCG

   CACCGNACCATGTGCTCATGGCCTCCTCCCCGGCTGTCAGTTTCCCTATCTGTAAAATGA 3'
   GTGGCNTGGTACACGAGTACCGGAGGAGGGGCCGACAGTCAAAGGGATAGACATTTTACT 5'

The minisatellite repeat sequence in clone λMS29. The region in the 39bp consensus repeat sequence
similar to the 3' end of the minisatellite core sequence (GGAGGTGGGCAGGARG, Jeffreys et al., 1985) is
marked with asterisks. Deviations from the consensus sequence are shown for two blocks of 4 contiguous
repeat units (a-d, e-h) cloned at random from λMS29. The 39bp repeat sequence consists of 13 diverged
repeats of PyAG.

consensus   TTGCAGTAGCTGTGGCAGGAGGAGTAGCAGCATCAGCAG
            -->-->-->-->-->-->-->-->-->-->-->-->-->

            (YAG)13



pMS31 5' flanking sequence extending into minisatellite

5- GATCCACTCGGAACCACCT GCACTTAGGAGCAAGCCTAGAATGTTCTGGAAGGATTGAAG 
3 . CTAGGTGAcicTTGGTGGAiGTCAATCCTCGTTCGGATCrTACAAGACCrTCCTAACrTC 

C(y^GCCTTGTCTGAGGCCCTGGGAAAGTCGCC^^ 

GGTCGGAAciGACTCCGGGicCCTTTCAciGGACCTGTAicCCTACACCGACCTCCTGGG 
GAGGAAGATGCTGAACTCCTGTGTGA^ 

CTCCTTCTACGACTTCAGGicACACTCCGGGCCAGACCCTCGGTGCCGGGGAGGGGGTGA 
CAGTCCGGCCTGCTGGGOTTTCCT^^ 

GTCAGGCCG^ACGACCCCAiAGGACGGCciGGAGGGAGTiTCGGGTGCCGAGGGGTCCAC 
GCTCTGGCCCGGGAGCCACAGGCA^^ 

^^I^I^^G^CCCTCGGTGicCGTGTTGGiTCCGTCCCCTTCGGCGTACGT^^^^ 
CTCTTCCCTTTGCACGCTCGACGG^ 

^I^II^^^CGTGCGAciTGCCACCGcLuUCCGGAAAicGGAACCCGAGTCCTCCCC^ 

TGGGGGGTCAGGAGGG^ 3' ^j^g^^g^LITE 

ACCCCCCAGTCCTCCCCGGTACTTCCCCTGGACCGGAACC 5'

FLANKING SEQUENCE 3' TO THE MS31 LOCUS



MINISATELLITE



UT»rr«!ATPTTTTE 5> CCGGCCGG ATGGGCGTGT GGGGACGGTG TGCCGGTGTG GGGACGGGGT 
MINISATELLITE 5^ ^CGGCCGG ^^^^^^^^ CCCCTGCCAC ACGGCCACAC CCCTGCCCCA 

rriCGTGTGG GGACGGGGTG CAGGTGTGGG GACGGGGTGC AGGTGTGGGG ACGGCGTGCA 
?GTcSS?c SctgcS GTCCACACCC CTGCCCCACG TCCACACCCC TGCCGCACGT 

GGTGTGGGGA CGGGGTGCAG GTGTGGGGAC GGCGTGCTGT GGGGATC 3' 
CCACACCCCT GCCCCACGTC CACACCCGTG GCGCACGACA CCCCTAG 5 



5' flanking sequence of pMS32.6 extending into minisatellite

5' GATCACCGGTGAATTCCACAGACACTTAAAAGCAAAAATAATAATTTGTTGAATACAGTG
3' CTAGTGGCCACTTAAGGTGTCTGTGAATTTTCGTTTTTATTATTAAACAACTTATGTCAC

   AGTTCTAAATTTCTCTTCAAAGAATCAGTATGTCAGTATGTTCAGTTCTTTGTTCTCCAT
   TGAAGATTTAAAGAGAAGTTTCTTAGTCATACAGTCATAGGAGTCAAGAAAGCCGAGGTA

   TTTAAAGTTGAACTTCCTCGTTCTCCTCAGCCCTAGTTTCACTAAACAACCTTTTCCAAC 3'
   AAATTTCAACTTGAAGGAGCAAGAGGAGTCGGGATCAAAGTGATTTGTTGGAAAAGGTTG 5'

MINISATELLITE

pMS32.6 3' flanking sequence extending from minisatellite

5' AACCACCCTTCCCAC
MINISATELLITE
3' TTGGTGGGAAGGGTG

   CAAACTACTCACCCCGCCACTCTGGCTCATACCCCTGCTCTCTTTAAAGTAGCCAATCGG
   GTTTGATGAGTGGGGCGGTGAGACCGAGTATGGGGACGAGAGAAATTTCATCGGTTAGCC

   AATTAGCTTAGACTGTGCAGTCCAACCCTAGCCGATACGGGAACGACGCATCAGTAGGGG
   TTAATCGAATCTGACACGTCAGGTTGGGATCGGCTATGCCCTTGCTGCGTAGTCATCCCC

   CTACCTGTGTCAGGAATCAGAACCCCTTCCCCTCCCTTGTTCAGGTGTGCTCTGGCCATT
   GATGGACACAGTCCTTAGTCTTGGGGAAGGGGAGGGAACAAGTCCACACGAGACCGGTAA

   GCTCCATCTGCGAGTCGCACCCTTCTAGAGAAGTAAAATTGCCTTGCTGAGAAAATTAAA
   CGAGGTAGACGCTCAGCGTGGGAAGATCTCTTCATTTTAACGGAACGACTCTTTTAATTT

   CTTATGTTTGAGTGGTATTTCTTTTGCAGCACCAAAAATTTATTTACAACAAATTCTACA
   GAATACAAACTCACCATAAAGAAAACGTCGTGGTTTTTAAATAAATGTTGTTTAAGATGT

Flanking sequences of 33.1



TGTAC^CTCC CAGTGCCCGA GAGGATGGGG TTGAGAGCTA AGCTGAGAAA GTCTATCTCT 
IcIt^^SgG GTCACGGGCT CTCCTACCCC AACTCTCGAT TCGACTOrXT CAGATAGAGA 

GAATTTCANT GCTGATAAAA ACAACCGAGC CTGCCGGGGC AGGGGCTTGC TTT CTCCACG 

otISg^na Sactatttt tgttggctcg gacggccccg tccccgaacg aaagaggtgc 



GATGGGATGC CACA ACTGC 
CTACCCTACG GTGTTGACG 



MINISATELLITE 



GGGTGACACG GCACGGTGGC 
CCCACTGTGC CGT GCCACCG 

AGTCTTTACA GAGTCTTGTC 
TCAGAAATGT CTCAGAACAG 

TTCCTGTGGA 3' 
AAGGACACCT 5' 



TTTCCTGAA CATCGTACCC 
AAAGGACTT GTAGCATGGG 

TTAGGGACAC AGATGATAGA 
AATCCCTGTG TCTACTATCT 

AAGCCCTCAG GTCTGCGCAC 
TTCGGGAGTC CAGACGCGTG 



CCCACCCCCC AGAAGCTTGT 
nnrrvnaa aGG TCTTCGAACA 

AGCTGGCTGC ATAAAAGACA 
TCGACCGACG TATTTTCTGT 

GTCTGATCTA ACCAAAGTTT 
CAGACTAGAT TGGTTTCAAA

Flanking sequences of 33.4



T= S foxS S= c= 

5S S S= ™= ™ 
=^ gg^S Sg= =S 



MINISATELLITE



s= s^??^ =?s s= 

= 1=0 ?= i?Sf.1^JSSJ= 

S= =S = oTcS I^^oS^ 

1= i: 



Flanking sequences of 33.6



^^.....r.. nnAr,ACCTCA CATTTGACCT TGGMAGT 
ICACTCA-TCT CCTCTGGAGT GTAAACTGGA ACCTTTCA 

MINISATELLITE 

?src = = s= =1 

CTTGGATTGA GTAATGTCTC ACCTGCTTGC 3' 
GAACCTAACT CATTACAGAG TGGACGAACG 5'

FLANKING SEQUENCES OF THE MS51 LOCUS

^^^m^Wii^ 1= =s " 
= — = — 



GGGCTGGA.GG GAC 
CCCGACCTCC CTG 

MINISATELLITE 



S ^= =1 = 

IS = =fc = = 

CCAAAGGCCA CCTGCCCACA CTGGCACCGA ATTC 3' 
GGTTTCCGGT GGACGGGTGT GACCGTGGCT TAAG 5'

MS51- Sequence across the adjacent Hinf I site 

1 AGTGGATTTT CCCACTTCCC TCTGTTGTCT ATAATGAATG ACCACTACAT 

51- TTTTAACCCC AAAAGCTTCT CATATGAAGA GGAGACTTCT TTCAGCAGCA 

101 GGCAGGGCAC CTtCTCTGAA GCTCGCCAGC AAGCACCAAC CGACTCCCTG 

151 GCCACGGCAG CTGACGCCCA GGCAGTGACC CGACACGGGA GGGCTCTCGA 

201 AAAGRCTGGA GCTCCCATGA C

Flanking sequences of the hypervariable region 5'
to the human insulin gene
(AM. J. HUM. GENET., 1986, 39, 291-229)

crrCQGCrGC TGTCCTAAGG CAGGGTGGGA ACTAGGCAGC CAGCAGGGAG GGGACCCCTC 
3- gI?Sg TckIZ^cc GTCCCACCCT TGATCCGTCG GTCGTCCCTC CCCTGGGGAG 

CCTCACTCCC ACTCTCCCAC CCCCACCACC TTGGCCCATC CATGGCGGCA TCTTGGGCpA 

cSGmcGG tSgagggtg ggggtggtgg aaccgggtag gtaccgccgt agaacccggt 

TCCGGGACTG GGG MINISATELLITE ^CAG 

AGGCCCTGAC CCC 

CAGCGCAAAG AGCCCCGCCC TGCAGCCTCC AGCTCTCCTG GTCTAATGTG GAAACTGGCC 
GTCGCG^^^C ^CGGGGCGGG ACGTCGGAGG TCGAGAGGAC CAGATTACAC CTTTCACCGG 

CAGGTGAGGG CTTTGCTCTC CTGGAGACAT TTGCCCCCAG 3* 
GTCCACTCCC GAAACGAGAG GACCTCTGTA AACGGGGGTC 5' 



FLANKING SEQUENCES OF 5' ALPHA GLOBIN HVR

(AM. J. HUM. GENET., 1988, 43, 249-256)



,^nnn UTMT«5ATT1.1.ITE CTGTGCGGG AAGCCCGAAA TCCTTA 3' 

r. SSIg looo "^^^^"^"'^ g'Igagggcg ttcgggcttt aggaat 5. 



FLANKING SEQUENCES OF THE ALPHA GLOBIN 3' HVR
(EMBO, 1986, 5, 1957-1863)

5' AGTCCCACCT GCAGGAAAAG GTGCAGGTAA GG
3' TCAGGGTGGA CGTCCTTTTC CACGTCCATT CC

MINISATELLITE

AGGAACAGCA ACACGCAGGG AATATCGACA TGGGTGATGC
TCCTTGTCGT TGTGCGTCCC TTATAGCTGT ACCCACTACG

CTGCAAAGCA CAGCATCAAT CCCAAGTGAC TGA 3'
GACGTTTCGT GTCGTAGTTA GGGTTCACTG ACT 5'

Flanking sequences of hypervariable region at the zeta globin locus
(PNAS, 1983, 80, 5022-5026)



ACACCCATCA ATGGGAGCAC CAGGACAGACAT GGAGGCTAAT GTCATGTTGT AGACAGGAT 
3. ^ctSgtIgt TACCCTCGTG GTCCTGTCTGTA CCTCCGATTA CAGTACAACA TCTGTCCTA 

rrTrrrrAGC TGACCACACC CACATTATTA GAAAATAACA GCACAGGCTT GGGGTGGAGG 
cSS^CG IcTGG^G^G GTGTAATAAT CTTTTATTGT CGTGTCCGAA CCCCACCTCC 

CGGGACACAA GACTAGCCAG AAGGAGAAAG AAAGGTGAAA AGCTGTTGGT GCAAGGAAGC 
GCCCTGTGTT CTGATCGGTC TTCCTCTTTC TTTCCACTTT TCGACAACCA CGTTCCTTCG 

TCTTGGTATT TTCAACGGCT MINISATELLITE AGO 

AGAACCATAA AAGTTGCCGA 

20 TACAGGGAGA AAAGACTTGG TGCTGTGGGC CTGCCTTGGG GCTGGTGGTA CAGCCCTTAT 

ItWcStCT m^^SIcC ACGACACCCG GACGGAACCC CGACCACCAT GTCGGGAATA 

rTGCTGCCCT CAGGATCTCC CGGCCCCTCT CGTCCAGGCC CCTGCAACCC CATGCCCCAG 

ScgS SgIgg gccggggaga gcaggtccgg GGACGTTGGG gtacggggtc 

CCTCTGAGGA CCAAAGGCGC CCCTGCTTGG GAAGAGGGGG CTCAGGGGAG TCGCCTGACC 
gSgIctCCT GG^^^CCGCG GGGACGAACC CTTCTCCCCC GAGTCCCCTC AGCGGACIGG 

CGGTTCCAAG CCAGGCTGAT TTACCGTTGT TAACATCCTA GTGCACGCAT CCCTCTGCCT 
Ifci^G^c ggSta AATGGCAACA ATTGTAGGAT CACGTGCGTA GGGAGACGGA 



CATGCACCCA ACTCCAAGGC CTGGTACAC 3'
GTACGTGGGT TGAGGTTCCG GACCATGTG 5'

FLANKING SEQUENCES OF HA-RAS LOCUS
(NATURE, 302, 33-37)



TGCTCCAAGG GGCTTCCCCT GCCTTGGGCC AAGTTCTAGG TCTGGCCACA GCGACAGACA 
ACGAGGTTCC CCGAACGGGA CGGAACCCGG TTCAAGATCC AGACCGGTGT CGGTGTCTGT 

GCTCAGTCCC CTGTGTGGTC ATCCTGGCTT CTGCTGGGGG CCCACAGCGC CCCTGGTGCC 
CGAGTCAGGG GACACACCAG TAGGACCGAA GACGACCCCC GGGTGTCGCG GGGACCACGG 

CCTCCCCTCC CAGGGCCCGG GTTGAGGCTG GGCCAGGCCT CTGGGACGGG GACTTGTGCC 
GGAGGGGAGG GTCCCGGGCC CAACTCCGAC CCGGTCCGGA GACCCTGCCC CTGAACACGG 

CTGTCAGGGT TCCCTATCCC TGAGGTTGGG GGAGAGCTAG CAGGGCATGC CGCTGGCTGG 
GACAGTCCCA AGGGATAGGG ACTCCAACCC CCTCTCGATC GTCCCGTACG GCGACCGACC 

CCAGGGCTGC AGGGAC MINISATELLITE 
GGTCCCGACG TCCCTG 

GAGTGACCAG CTTCCCCATC GATAGACTTC CCGAGGCCAG GAGCCCTCTA GGGCTGCCGG 
CTCACTGGTC GAAGGGGTAG CTATCTGAAG GGCTCCGGTC CTCGGGAGAT CCCGACGGCC 

GTGCCACCCT GGCTCCTTCC ACACCGTGCT GGTCACTGCC TGCTGGG6GC GTCAGATGCA 
CACGGTGGGA CCGAGGAAGG TGTGGCACGA CCAGTGACGG ACGACCCCCG CAGTCTACGT 

GGTGACCCTG TGC 3' 
CCACTGGGAC ACG 5'

Additional Single Locus Probes Cloned by Ray White

Locus     Probe        Enzyme   Alleles   Allele size   Heterozygosity   Reference
D17S4     pCMM86       HinfI    >10       5-1.3kb       90%              NAR 16, 5223
D17S26    EFD52        MspI*    >8        5-10kb        83%              NAR 16, 786
D17S28    pYNH37.3     MspI*    5         2-4kb         78%              NAR 16, 782
D7S396    pJCZ67       MspI*    4         3-6kb         80%              NAR 16, 4191
D13S52    CMHZ47       MspI*    >10       1.5-3.0kb     80%              NAR 16, 3119
D14S23    CKKA39       MspI*    >10       2-4.5kb       83%              NAR 16, 3120

Myoglobin
HBV-4
HBV-2
Zeta globin
YNZ22

* Use of HinfI not investigated

The invention will now be illustrated but not limited by reference to the following:



Example 1



MATERIALS AND METHODS



(a) Preparation of genomic DNA, oligonucleotides and hybridization probes

Human DNA samples were provided by CEPH, Paris, or were prepared from venous blood as described
elsewhere [26]. Oligonucleotides synthesised on an ABI 380B DNA synthesiser using reagents supplied by
Cruachem were purified by ethanol precipitation and dissolved in water. The 5.6 kb Sau3A insert of
human minisatellite clone MS32 [10] was subcloned into the BamHI site of pUC13 [...]. The
minisatellite inserts from recombinant M13 RF DNAs 33.1, 33.4 and 33.6 [...] were isolated as a [...]
BamHI-EcoRI fragment, a 2.7 kb Sau3A-EcoRI fragment and a 0.7 kb BamHI-EcoRI fragment respectively, and
subcloned into [...] digested with BamHI plus EcoRI, to produce the plasmid series p33.1, p33.4 and p33.6.
[...] electrophoresis through 1% low gelling temperature agarose (Sea [...]). [...]
fragments were dissolved in water at 65° to a final concentration of 2 µg/ml DNA. 10 ng aliquots of DNA were
labelled with 32P by random oligonucleotide priming [28].

(b) Polymerase chain reaction

[...] step of 1 [...] phase at 70° for 15 minutes.

(c) Southern blot analysis of PCR products

Paraffin oil was removed from PCR reactions by extraction with diethyl ether. Agarose gel
electrophoresis of PCR products, Southern blotting onto Hybond N (Amersham) and hybridization with
32P-labelled minisatellite probes were carried out as described previously [10], except that competitor human
DNA was omitted from all hybridizations. Restriction digests and S1 nuclease digestion of PCR products
were performed by diluting 5 µl PCR reaction with 25 µl restriction endonuclease or S1 nuclease buffer [29]
and digesting for 30 minutes at 37° with 3 units restriction endonuclease or S1 nuclease (BRL) prior to gel
electrophoresis.



(d) Isolation and PCR analysis of single human cells

Lymphocytes were isolated by diluting venous blood with an equal volume of 1xSSC (saline sodium
citrate, 0.15M NaCl, 15mM trisodium citrate, pH 7.0), layering over Histopaque-1119 (Sigma) and
centrifuging at 2000g for 10 min. Cells at the interface were diluted with 3 vol 1xSSC and banded again over
Histopaque. Cells were pelleted by centrifuging at 2000g for 10 minutes, washed three times with 1xSSC,
with centrifugation, and resuspended in 1xSSC to 10^4 cells/ml.

Buccal cells were isolated by diluting 0.5ml saliva with 5ml 1xSSC and centrifuging at 2000g for 10
minutes. The cell pellet was rinsed three times with 1xSSC and resuspended to 10^4 cells/ml.

Approximately 0.1 µl aliquots of the cell suspensions were pipetted onto a siliconised microscope slide
and rapidly examined at 100x magnification on an inverted microscope. Droplets containing a single
nucleated cell were immediately diluted with 0.4 µl 1xSSC and transferred to an Eppendorf tube using a
disposable tip pipette. The microscope slide was re-examined to check that the cell had been removed with
the droplet.
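
The need to inspect every droplet under the microscope can be illustrated with a simple occupancy estimate. The sketch below assumes that cells are distributed randomly, so that the number per droplet follows a Poisson distribution; that assumption is not stated in the specification. It also assumes a suspension of about 10^4 cells/ml and droplets of about 0.1 µl, values inferred from the volumes given in this section.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k cells in a droplet when the expected number is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def mean_cells_per_droplet(cells_per_ml, droplet_ul):
    """Expected number of cells in one droplet (1 ml = 1000 ul)."""
    return cells_per_ml * droplet_ul / 1000.0


if __name__ == "__main__":
    mean = mean_cells_per_droplet(1e4, 0.1)  # assumed concentration and droplet volume
    p_empty = poisson_pmf(0, mean)
    p_single = poisson_pmf(1, mean)
    p_multiple = 1.0 - p_empty - p_single
    print(f"mean cells per droplet: {mean:.2f}")
    print(f"empty: {p_empty:.3f}  single: {p_single:.3f}  two or more: {p_multiple:.3f}")
```

Under these assumptions only a little over a third of droplets would hold exactly one cell, which is consistent with selecting single-cell droplets by eye rather than relying on dilution alone.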

Cells were lysed prior to PCR either by heating or by treatment with sodium dodecyl sulphate (SDS)
and proteinase K [23]. In the former case, the cell droplet was diluted with 4.5 µl 5 mM Tris-HCl (pH 7.5)
containing 0.1 µM oligonucleotide primers, overlaid with paraffin oil and heated at 95° for 3 minutes prior to
the addition of 5 µl 2x concentrated PCR buffer/primers/Taq polymerase and amplification. In the latter case,
the cell droplet was mixed with 0.5 µl 5 mM Tris-HCl (pH 7.5), 0.1 µM primers plus 1 µl 6 mM Tris-HCl
(pH 7.5), 40 mM dithiothreitol, 3.4 µM SDS, 50 µg/ml proteinase K [23], overlaid with paraffin oil and
digested at 37° for 45 minutes. 3 µl water were added to the digest, and heated at 95° for 3 minutes to
inactivate proteinase K prior to addition of 5 µl 2x PCR reaction mix as above.




(a) Selection of human minisatellites for amplification by PCR

The strategy for amplifying minisatellites is shown in Figure 1. Oligonucleotide primers corresponding to
unique sequence DNA flanking the minisatellite are used to drive amplification of the entire minisatellite by
Taq polymerase. Amplified alleles are detected by Southern blot hybridization with a minisatellite probe
located internal to the priming sites. Six cloned minisatellites were chosen for study (Table 1). Two of them,
pλg3 and MS32 [8, 10], detect highly variable loci with heterozygosities of 97% and more than 40 alleles
varying in the number of repeat units. The other four minisatellites, 33.1, 33.4 and 33.6 [5] and pMS51,
isolated as a Sau3A-EcoRI DNA fragment cloned from a DNA fingerprint (A J Jeffreys, unpublished data),
detect much less variable loci with heterozygosities of 66-77%; the alleles are however shorter than those of
pλg3 and MS32 (Table 1) and should be more amenable to amplification by PCR. The flanking sequences
of pλg3, 33.1, 33.4 and 33.6 have been described previously [5, 8]; the flanking DNA of λMS32 and pMS51
was sequenced as described before [8]. All flanking DNA sequences were screened against the EMBL DNA
sequence database to identify repeat elements such as Alu, and PCR oligonucleotide primers A and B
(Figure 1) were designed to avoid such elements. Details of all primers and hybridization probes are given
in the Figure 1 legend.



(b) Fidelity and efficiency of PCR amplification of human minisatellite alleles

To determine the ability of Taq polymerase to amplify in particular long minisatellite alleles, a mixture of
0.1 µg genomic DNA from each of 4 individuals, giving a total of 8 different MS32 alleles ranging in length
from 1.1 to 17.9 kb, was amplified for 10-20 cycles using λMS32 flanking primers A and B, followed by
Southern blot hybridization with a minisatellite probe (Figure 2A). Using 6 minutes extension times for Taq
polymerase, only the four shortest alleles (1.1-2.9 kb) were efficiently amplified. Increasing the extension
time to 15 minutes, to improve the chance that the Taq polymerase would progress completely across the
minisatellite, gave a marked increase in yield of the next two larger alleles (4.5, 6.6 kb) though no further
improvement was seen with 30 minutes extensions. The relative yield of large alleles could also be
improved by increasing the concentration of Taq polymerase (Figure 2B), allowing the detection of a 10.2
kb allele, albeit faintly. Addition of extra Taq polymerase at the 13th cycle gave only a marginal
improvement in yield, and there is no evidence for a significant drop in polymerase activity during these
prolonged extension times. Further experiments varying annealing temperature, extension temperature and
buffer concentration failed to improve the yield of large alleles (data not shown), and all further experiments
used 15 minutes extension times and high concentration of Taq polymerase (1.5 units per 10 µl PCR
reaction).

At low cycle numbers (10 cycles), the alleles amplified appear to be completely faithful copies of the
starting λMS32 alleles, as judged by their electrophoretic mobilities (Figure 2A). At higher cycle numbers
(14, 17 cycles), there is an increase in background labelling; since most of this can be eliminated by
digestion with S1 nuclease (data not shown), much of this background probably arises from low levels of
single-stranded templates from the previous cycle which have failed to prime, and from incomplete
extension products from the previous cycles which by definition cannot prime. At high cycle numbers (20),
the hybridization pattern degenerates to a heterodisperse smear, as expected since the yield of PCR
product becomes so high (>400ng/ml) that out-of-register annealing of single-stranded tandem-repeated
minisatellite DNA will occur during the extension phase. This will lead to premature termination of extension
at a reannealed site, to spurious "alleles" arising from the extension of incomplete templates annealed
out-of-register to the complementary strand of a minisatellite, and to the formation of multimolecular networks of
reannealed minisatellite DNA strands.

The yields of each λMS32 allele amplified by PCR were quantified by scanning densitometry (Figure
3). PCR products from 0.1 µg genomic DNA accumulate exponentially at least up to cycle 17.

The gain in product per cycle decreases monotonically with allele length, with lower gains for 6 minutes
compared with 15 minutes extension times. The gain versus allele length curves extrapolate back to a gain
per cycle of approximately 2.0 for very short alleles, indicating that the efficiency of denaturation and
priming at each cycle is close to 100%. Final yields of an allele can be calculated from these curves: for an
allele A with gain g_a per cycle present initially at n molecules, the yield after c cycles is approximately
n.g_a^c. Similarly, the molar imbalance between alleles A and B of different lengths, arising through more
efficient amplification of shorter alleles, is given by (g_a/g_b)^c. For example, after 10 cycles of amplification
with 15 minutes extension times, the molar yield of a 1 kb allele will be 18 times higher than that of a 6 kb allele.
After 25 cycles the imbalance will be 1300-fold. This imbalance is ameliorated to some extent by the more
efficient detection of longer alleles by the minisatellite hybridization probe. Nevertheless, long alleles
amplified by PCR will become increasingly difficult to detect with low amounts of starting human genomic
DNA.

Minisatellites pλg3, pMS51, 33.1, 33.4 and 33.6 were also tested for their ability to be amplified by PCR
(data not shown). In all cases, faithful amplification of all alleles tested was observed, except for the longest
([...] kb) allele of pλg3 which as expected failed to amplify. Again, yields of PCR product fell with increasing
allele length.
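
The molar imbalance between a short and a long allele can be estimated numerically. The sketch below uses the linear approximation g = 2 - 0.093 L given earlier in the specification for 15 minute extensions; it is only an approximation to the curves of Figure 3, so the numbers it gives are of the same order as, but not identical to, the 18-fold and 1300-fold figures quoted above. The function names are hypothetical.

```python
def gain(length_kb):
    """Approximate gain per cycle from the linear fit g = 2 - 0.093 L (L = 0-6 kb)."""
    return 2.0 - 0.093 * length_kb


def molar_imbalance(short_kb, long_kb, cycles):
    """Ratio of molar yields (short allele / long allele) after `cycles` cycles."""
    return (gain(short_kb) / gain(long_kb)) ** cycles


if __name__ == "__main__":
    for cycles in (10, 25):
        ratio = molar_imbalance(1.0, 6.0, cycles)
        print(f"{cycles} cycles: 1 kb allele favoured about {ratio:.0f}-fold over 6 kb allele")
```

Because the imbalance grows geometrically with cycle number, small differences in per-cycle gain dominate the outcome at high cycle numbers, which is why long alleles become hard to detect from small starting amounts of DNA.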



45 (c) Fidelity of amplification of single minisatellite molecul^ 

predicted since epg t,.nnan DNA «ill ""^..^S^S. by Pc" No%,>M>^ emplincfion , 
and 6pg DNA samples (Figures 4A, B). S1 nuclease digestion eliminated one band (Figure 4B) which
appears only occasionally, which comigrates with denatured PCR product and which can be largely
eliminated by chasing the PCR products by a final annealing/extension step (data not shown, see Materials
and Methods). This S1 nuclease-sensitive band presumably corresponds to single-stranded template which
failed to prime in the final PCR cycle. The remaining spurious bands detected by λMS32 were resistant to
S1 nuclease but were reduced in size, along with the correct PCR product, by digestion with restriction
endonucleases which cleave DNA flanking the minisatellite (Figure 4B). These spurious bands presumably
represent abnormal PCR products with normal flanking DNA but altered numbers of minisatellite repeat
units. They are particularly prominent after 30 cycles of amplification, are generally present in low amounts
compared with the authentic allele, and vary in length from reaction to reaction, in contrast to the parent allele.
They are not the result of contamination of the PCR reactions with human DNA or with products of previous
PCR reactions, since they only appear in reactions where successful amplification of an authentic allele has
occurred (Figure 4A) and have been consistently seen with all human DNAs tested (data not shown). Since
almost all of the spurious products are shorter than the authentic allele, it is likely that they arise fairly early
in the PCR reaction and accumulate preferentially due to their short length and concomitant higher
efficiency of amplification. It is not yet clear how these "mutant" alleles arise, nor whether PCR conditions
can be found which will suppress their appearance. A similar frequency of appearance of abnormal alleles
has been seen with pλg3, but only much less frequently with the other four minisatellites tested.

Somatic mutations at minisatellite loci in the starting human genomic DNA could also be a source of
unexpected PCR products. Such mutations do exist, particularly for λMS32, as shown by the appearance of
mutant minisatellite alleles in clonal tumour cell populations [30]. Somatic mutants are however unlikely to
be a major source of the spurious bands shown in Figure 4, since no PCR reactions on 6pg human DNA
have yet been seen which show a mutant allele appearing in the absence of the normal parental allele.
(d) Co-amplification of multiple minisatellites: PCR-derived human DNA fingerprints

Figure 4 demonstrates that two minisatellites can be successfully co-amplified in the same PCR
reaction. Further analyses showed that at least six minisatellites could be co-amplified without any apparent
interference between loci. Furthermore, the PCR products could also be typed simultaneously by Southern
blot hybridization with a cocktail of all six minisatellite probes. Examples of such multilocus PCR-derived
DNA "fingerprints" are shown in Figure 5. In all cases, the PCR reaction was limited to 15-18 cycles to
minimise the appearance of spurious products as seen in Figure 4. Repeat analyses of the same individual
showed that the pattern was reproducible, with all hybridizing DNA fragments representing authentic
minisatellite amplification products. DNA "fingerprints" could be readily derived from 1ng human DNA. On
occasion, one or two loci failed to amplify (individuals 1, 12, Figure 5A, C); this failure usually affected 33.4,
followed by pMS51, and was least likely to affect 33.1 (data not shown). The likelihood of failure appears to
correlate with the GC content of the minisatellite repeat units (Table 1), and suggests that non-amplification
results from failure to denature GC-rich minisatellites at 95°, probably due to localised variations of
temperature in the heating block or to poor thermal conductivity between the block and the reaction tube.

These PCR DNA fingerprints are derived from six loci with widely differing levels of variability (Table 1).
To determine the overall complexity and level of variability of these patterns, unrelated individuals were
compared (Figure 5B). On average, 8.9 bands were resolved per individual (range 6-11, Figure 6). The
maximum possible number of bands is 12 (Figure 5A), corresponding to heterozygosity at all loci, with no
electrophoretic comigration of alleles from different loci and with no alleles too long to be amplified by
PCR. In pairwise comparisons of unrelated individuals, there are on average 10.8 bands which are
discordant between pairs of individuals (range 5-18). Since the distribution of discordancies approximates to
a Poisson distribution, then the chance that two unrelated individuals would show identical DNA fingerprints
(no discordant bands) is approximately 2 x 10⁻⁵. These patterns therefore show a good degree
of individual specificity. Differences in PCR-derived DNA fingerprints are also readily detectable between
closely related individuals, in for example the 3-generation family shown in Figure 5C in which faithful
transmission of bands from parent to offspring can also be seen.
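
The chance of identity between two unrelated individuals follows from the Poisson approximation described in this section: if the number of discordant bands has a mean of 10.8, two fingerprints are identical only when there are zero discordancies. The following Python lines are a minimal sketch of that arithmetic; the function and variable names are illustrative and are not taken from the original work.

```python
import math

def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

# Mean number of discordant bands between unrelated individuals (from the text).
MEAN_DISCORDANT_BANDS = 10.8

# Identical fingerprints correspond to zero discordant bands.
p_identical = poisson_pmf(0, MEAN_DISCORDANT_BANDS)
print(f"P(identical fingerprints) = {p_identical:.1e}")
```

Under these assumptions the result is of the order of 2 x 10⁻⁵, in line with the chance of false association quoted in the Conclusions.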

(e) Typing minisatellites in single human cells 

Cells can be subjected directly to PCR without the need for purifying DNA [23], which greatly
simplifies the analysis of specimens such as blood. In preliminary experiments it proved possible to type
minisatellites in blood by first freezing and thawing blood to lyse erythrocytes, followed by centrifugation to
collect white cells and nuclei, and heating in water to lyse cells prior to PCR. Using this method it was
possible to reproducibly type 0.001-0.01µl blood, corresponding to 3-30 nucleated cells (data not shown).
It also proved possible to type single small lymphocytes (Figure 7A) from which five minisatellites were
simultaneously co-amplified and typed by sequential hybridization. Successful and reasonably faithful
amplification was seen following lysis either with proteinase K plus SDS or by
heating in water. Individual nucleated buccal cells could also be typed following proteinase K/SDS lysis
(Figure 7B), though these cells failed to lyse in water (data not shown).

To test the feasibility of identifying individual cells, 14 buccal cells from two individuals were separately
typed in a blinded test (Figure 7B). In four cases, no amplification products were detected at any of
the loci, suggesting either that the cell had not been transferred to the PCR reaction, or that lysis had not
occurred, or that nuclear DNA had degraded prior to PCR. In the remaining 10 cells, amplification products
could be detected from at least two of the minisatellite loci, and in some cases all five loci amplified
successfully from a single cell. Omitting the large alleles of MS32, which amplify poorly and would be
difficult to type at the single cell level, we estimate that, for those single cell PCR reactions in which at least
some loci have amplified, approximately 75% of alleles present could be detected. This
estimate agrees with the efficiency of single molecule amplification determined from PCR analysis of 6pg
samples of human DNA (Figure 4). As expected from Figure 4, several instances of spurious bands were
seen in both buccal cell and lymphocyte PCR reactions (Figure 7A, B). Nevertheless, distinguishing bands
from each of the two individuals tested could be detected in the 10 successfully-typed buccal cells, and the
origin of each buccal cell was successfully predicted in this blinded trial.

Finally, note the presence of amplification products of 33.6 in some of the DNA-free controls in Figure
7A, B, and of 33.4 in one of the buccal cell controls. In practice, we have found that such contamination
pfobLy with';^^^^^^^^^ DNAs or the products of previous PCR reactions ra^er^^^^^^ wrth ^^'^^^^or 

which have been in continuous use in our laboratory for the last four years. 



(f) CONCLUSIONS 

The m,nisa«te prob» a» «ns,«v, a.d can 

^ » preSct th, ■a.mbor o, PCR cyctea needed to, ^^^^^"l^^ 'p^R S^Z 

s"i;ser:orrr«e7^^^^^^^^ 

quanma«,»ly to .Mlmale low concenWton, ■^JT^^^'^S^'m^^ ^ " "° 

polyme,^ will rot be limiting dunng *"X"To^ a" « eiSrent ■nlnleatelll.e. can be co- 

'-'?:rSSon =, .InlaatslMea «~r'=n" SSld^'S" — ^ 
down to 1ng human DNA. Information can also be recovered from much lower amounts of DNA and single
cells, although the generation of spurious DNA fragments at some loci, during the relatively large number of
PCR cycles needed for single cell typing, could present significant problems for individual identification at
the level of one or a few cells. Fortunately, these spurious PCR products appear to vary from reaction to
reaction, and duplicate PCR analyses of very small samples of DNA should therefore distinguish bona fide
amplified alleles from spurious PCR products.

PCR-derived DNA fingerprints already show a good level of individual specificity, with a chance of false
association of two individuals of approximately 2 x 10⁻⁵. In contrast, conventional DNA fingerprints obtained
by Southern blot hybridization with a multilocus polycore probe [6] or with a cocktail of locus-specific
human minisatellite probes [10] show much higher levels of individual specificity (<10"^2 and <10"S
respectively). However, the variability of PCR-derived DNA fingerprints could be improved substantially.
First, highly informative alleles, particularly at pλg3 and λMS32, cannot be detected above approximately 8
kb (1ng human DNA) or approximately 5 kb (single cell). This problem could be overcome by using highly
variable minisatellites with a more restricted range of allele lengths. Such loci appear to be scarce since
high levels of variability are usually associated with large numbers of minisatellite repeat units and long
alleles [5, 8, 10]. Some possibly appropriate loci have however been isolated ([9], J.A.L. Armour and A.J.
Jeffreys, unpublished data). Second, the number of minisatellites being amplified simultaneously could be
increased. Third, loci which are particularly prone to generate spurious PCR products, such as pλg3 and
MS32, could be identified and avoided. If these goals can be accomplished, then we see no reason why
reliable identification at the single cell level should not be possible, provided that inadvertent contamination
is avoided and that the potential presence of somatic mutations at hypervariable loci is taken into account
[30].

The use of multilocus DNA fingerprint probes in for example individual identification in forensic
medicine, paternity testing and monitoring bone marrow transplants is limited by the sensitivity of these
probes, which require at least 0.1-1µg human DNA for typing [6]. Similarly, locus-specific minisatellite
probes can only type down to approximately 50ng human DNA [10]. PCR-derived DNA fingerprinting
improves sensitivity by orders of magnitude and can be used to type specimens which are relatively
intractable by conventional Southern blot hybridization. For example, human hair roots typically contain
10-500ng DNA [22] and while approximately 70% of roots can be typed using locus-specific minisatellite
probes (Z. Wong, J.A.L. Armour and A.J. Jeffreys, unpublished data), all hair roots so far tested can be
typed by PCR-derived DNA fingerprinting (data not shown). Similarly, 0.001-0.01µl blood can be typed
without the need first to purify DNA. Likewise, saliva contains typically 100-1000 nucleated buccal cells per
µl, and PCR-derived DNA fingerprint analysis of submicrolitre samples of saliva is therefore possible. The
potential for typing trace amounts of hair, blood, semen, saliva and urine in forensic specimens, including
partially degraded samples, is obvious. The potential for inadvertent contamination of specimens, for
example with traces of spittle, is likewise evident.

PCR-derived DNA fingerprints should eventually become sufficiently individual-specific to provide a
highly polarised test for establishing parentage in for example paternity disputes. Not only would the need
to isolate DNA be obviated, but much smaller samples of blood obtained by finger-pricking rather than
venipuncture could be used. Alternatively, the determination of parentage could be based on the analysis of
saliva. This would avoid the problem of individuals who object to giving blood samples on religious or other
grounds, and would remove the trauma of taking blood from infants.



H2O with SOul 
mercaptoethanol, 
albumin (DNase 
units Taq polym 







EP 0 370 719 A2 



40µl paraffin oil, and cycled at 95° for 1.5 minutes, 67° for 1.5 minutes and 70° for 9.9 minutes for 30
cycles on a Techne Programmable Dri-Block PHC-1, followed by a final chase at 67° for 1.5 minutes and
70° for 9.9 minutes. Equivalent results were obtained on a Perkin Elmer Cetus DNA Thermal Cycler, cycling
at 95° for 1 minute, 67° for 1 minute and 70° for 10 minutes.
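
The two cycling programmes above can be written down as data to compare their total programmed hold times. This is only an illustrative Python sketch: the `Step` class and function name are hypothetical, ramping time between temperatures is ignored, and the cycle number for the Perkin Elmer Cetus instrument is not stated in the text, so 30 cycles is assumed.

```python
from dataclasses import dataclass
from typing import Sequence

@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    minutes: float  # hold time at that temperature

def total_hold_minutes(cycle: Sequence[Step], n_cycles: int,
                       chase: Sequence[Step] = ()) -> float:
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    per_cycle = sum(step.minutes for step in cycle)
    return n_cycles * per_cycle + sum(step.minutes for step in chase)

# Techne Dri-Block programme as described: 30 cycles plus a final chase.
techne_cycle = [Step(95, 1.5), Step(67, 1.5), Step(70, 9.9)]
techne_chase = [Step(67, 1.5), Step(70, 9.9)]
techne_minutes = total_hold_minutes(techne_cycle, 30, techne_chase)

# Assumption: 30 cycles on the Perkin Elmer Cetus cycler as well.
cetus_cycle = [Step(95, 1.0), Step(67, 1.0), Step(70, 10.0)]
cetus_minutes = total_hold_minutes(cetus_cycle, 30)

print(f"Techne: {techne_minutes:.1f} min, Cetus: {cetus_minutes:.1f} min")
```

On these assumptions both programmes hold the reactions for between six and seven hours, almost all of it at the 70° extension step.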



Isolation of amplified alleles 

50µl PCR reactions were mixed with 5µl loading mix (12.5% ficoll 400, 0.2% bromophenol blue, 0.2M
Tris acetate (pH 8.3), 0.1M Na acetate, 1mM EDTA), loaded onto a 1% agarose gel (Sigma Type I) and
electrophoresed in the presence of ethidium bromide. Amplified alleles were visualised using a long
wavelength u/v wand (or a short wavelength u/v transilluminator if necessary) and gel slices containing alleles
were excised. DNA was recovered by electrophoresis onto dialysis membrane, ethanol precipitated and
dissolved in 10µl H2O.



Amplification 

We have previously described how minisatellite alleles can be amplified using oligonucleotide primers
corresponding to DNA flanking the minisatellite to drive the amplification of the entire tandem repeat array.
By limiting the number of PCR cycles and detecting amplified alleles by Southern blot hybridization with a
minisatellite probe, we showed that faithful amplification was possible, but that at high cycle numbers the
products collapsed to a heterodisperse smear of amplified products due to annealing between
single-stranded tandem-repeated minisatellite DNA fragments. Also, most of the PCR products detectable at high
cycle numbers on an ethidium bromide stained agarose gel did not correspond to PCR products of the
minisatellite locus but arose by mispriming at other genomic sites (A.J. Jeffreys and V. Wilson, unpublished
data).

By decreasing the ionic strength of the PCR buffer and increasing the PCR annealing temperature to
reduce mispriming, as well as changing the source of the Taq polymerase, it has proved possible to amplify
minisatellites to the point where authentic PCR-amplified alleles can be directly visualised on an ethidium
stained gel (Figure 9). Some background smearing which resolves into discrete low molecular weight DNA
fragments does occur, as a result of the large number of PCR cycles needed to generate this amount of
product which produces, by annealing, heterodisperse minisatellite products. Nevertheless, authentic alleles
are clearly identifiable and correspond in size to the minisatellite alleles detected by conventional Southern
blot hybridization of human genomic DNA.

As noted previously (Jeffreys et al., 1988), yields of minisatellite alleles fall with allele length, and it has
not yet proved possible to amplify minisatellites longer than 6 kb to the point where they can be visualised
on an ethidium-stained gel. For 50µl PCR reactions containing 0.4µg human genomic DNA and processed
for 30 cycles, up to 1µg small (1 kb) alleles can be produced; this yield falls to ≈ 1ng for 6 kb alleles. In
heterozygotes with widely differing allele sizes, over-amplification of the smaller allele usually causes the
premature collapse of the larger allele and its disappearance from ethidium stained gels.

To map minisatellite variant repeats (MVR) in minisatellites, the amplified allele is recovered by
preparative gel electrophoresis, cleaved with a restriction endonuclease which cuts close to one or other
end of the amplified allele, and end-labelled by fill-in labelling at the cleaved end. Partial digestion of the
end-labelled fragment with appropriate restriction enzymes followed by agarose gel electrophoresis and
autoradiography generates a map of the location of internal MVRs. Terminal cleavage can be carried out
at a restriction site lying in flanking DNA within the region being amplified, or alternatively
one of the primers can span a suitable restriction site. The 5' primer for λMS32, 5'-
TCACCGGTGAATTCCACAGACACT-3', contains an EcoRI site 9-14nt within the primer, corresponding to
an EcoRI site in genomic DNA. In the absence of an appropriate site in DNA flanking the minisatellite, as is
the case with the region 3' to λMS32, a terminal site can be generated by using a 5' extended
Mapping MVRs within amplified MS32 alleles

Screening the CEPH panel of families of Mormon, French, Amish and Venezuelan origin has revealed
AluI alleles of λMS32 ranging from 1.2kb to 20kb long; since AluI alleles are only ≈ 0.2 kb longer than the
corresponding PCR products with the above primers, then AluI alleles longer than 6.2 kb cannot easily be
mapped internally due to low amplification efficiency.

Internal mapping results for the shortest allele in the CEPH panel are shown in Figure 10 for the EcoRI
and ClaI labelled amplified allele. Both produce a continuous HinfI ladder and a discontinuous HaeIII ladder,
the latter being complementary for EcoRI and ClaI. A complete map of the location of all MVRs cleaved
by HaeIII can be deduced. Repeat analysis of this allele amplified on separate occasions produced the
same map (data not shown), indicating fidelity of amplification and mapping.
Single Molecule MVR Mapping

We have shown that single minisatellite molecules can be amplified with reasonable efficiency and
fidelity, although with the occasional appearance of spurious PCR products of incorrect length (Jeffreys et
al., 1988). Use of the modified PCR conditions described here has almost completely suppressed the
appearance of spurious PCR products even from long alleles (Figure 12A). To determine the fidelity of
single molecule mapping, 16 aliquots of 6 pg DNA from CEPH individual 10208, heterozygous for
alleles 1 and 4 (Figure 12A), were amplified for 25 cycles with external primers A and B (Figure 1A), which
are reserved exclusively for single molecule analysis to minimize the risk of carry-over contamination.
Southern blot hybridization with an MS32 minisatellite probe revealed amplified products corresponding in
length to allele 1 in 5 samples, and allele 4 in 6 samples. No other amplified products were detected, and
zero DNA controls were also negative (data not shown). From these data 42% of single minisatellite
molecules give rise to PCR products, an efficiency similar to that reported previously (Jeffreys et al., 1988)
and similar for the amplification of larger alleles (Figure 12A legend). The 11 alleles amplified
from single molecules were re-amplified with the nested primers C1 and D, to minimize subsequent
contamination of single molecule PCR reactions and to eliminate spurious products arising by mispriming
with primers A and/or B elsewhere in the genome, and subjected to internal mapping. All maps obtained
were indistinguishable from the maps of alleles 1 and 4 obtained by amplification from bulk genomic DNA
(data not shown).
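
The text does not spell out how the efficiency of 42% was obtained from the counts of positive aliquots. One calculation that reproduces it, offered here only as an assumption, treats each 6 pg aliquot as containing on average one molecule of each allele and treats successful amplifications as Poisson-distributed across aliquots, so that the fraction of negative aliquots equals exp(-mean). A minimal Python sketch with illustrative names:

```python
import math

def mean_successes_per_aliquot(n_positive: int, n_total: int) -> float:
    """Poisson estimate of the mean number of amplifying molecules per aliquot,
    from the fraction of aliquots giving no product: P(0) = exp(-mean)."""
    return -math.log((n_total - n_positive) / n_total)

N_ALIQUOTS = 16  # 6 pg aliquots analysed

# Allele 1 was detected in 5 aliquots and allele 4 in 6 aliquots.
allele_1 = mean_successes_per_aliquot(5, N_ALIQUOTS)
allele_4 = mean_successes_per_aliquot(6, N_ALIQUOTS)

# Assumption: on average one input molecule of each allele per 6 pg aliquot,
# so the mean number of successes per aliquot is itself the efficiency.
efficiency = (allele_1 + allele_4) / 2
print(f"Estimated single-molecule amplification efficiency: {efficiency:.0%}")
```

With 5 and 6 positives out of 16 this averages to roughly 0.42, matching the figure quoted in the text.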
Amplification of new deletion mutants of MS32 from size-selected human genomic DNA

The ability to amplify faithfully single molecules of MS32 makes it possible to amplify and characterize
de novo mutants at this locus present in bulk human genomic DNA. An individual was selected who was
heterozygous for alleles of similar length but very different internal structure (alleles 31 and 32, Figure 12).
PCR analysis of 30 pg sperm DNA from this individual showed
amplification of single minisatellite molecules with no evidence for additional abnormal length alleles in any

-l^r?^^^^^^ 

S'S=nl/^»^^ 

Seig^r^^^^^ 

alleles were still present in fraction S1 and 52 '"afajina '"J^^^^ oroaenitor alleles yielded more 
from the progenitor alleles (data not shown).

Single molecule minisatellite deletion mutants were similarly amplified from size-fractionated DNA
prepared from blood taken from the same individual. Again, single molecule amplification events were
detectable in blood DNA, but not in control digests lacking blood DNA. Details of the number and
size of mutant sperm and blood alleles detected are summarized in Table 3. A total of 64 sperm and
42 blood mutants were observed, with mutants occurring most frequently in the largest size fractions.

The ability to align all sperm and blood mutants with one or other progenitor allele provides strong
supporting evidence that these short alleles represent bona fide mutant molecules,
rather than contaminants introduced during the isolation of sperm and blood DNA. Furthermore, the
repeated detection of identical mosaic mutant alleles further confirms the fidelity of single molecule MVR
mapping.

EXPERIMENTAL PROCEDURES 
Preparation of DNA 

Human DNAs were provided by CEPH, Paris, and were also prepared from venous blood collected from
80 unrelated English individuals as described elsewhere (Jeffreys and Morton, 1987). Blood DNA for single
molecule PCR analysis was prepared from 20ml blood collected into 20ml 1 x SSC (saline sodium citrate,
0.15M NaCl, 15mM trisodium citrate pH 7.0). The blood was frozen and thawed to lyse erythrocytes and
leucocytes collected by centrifugation at 10,000g for 10 minutes. The cell pellet was rinsed with 1 x SSC,
resuspended in 0.2M Na acetate (pH 7.0) and lysed by the addition of SDS to 1%. DNA was collected,
following phenol extraction, by three rounds of ethanol precipitation. All operations were carried out in a
laminar flow hood using pipettes, disposable plasticware and reagents which had not been previously
exposed to the laboratory, to minimize the risk of contamination. Sperm DNA was similarly
prepared from a single ejaculate from an individual who had avoided sexual intercourse prior to giving the
sample (to prevent contamination with DNA from his partner). Semen was diluted with an equal volume of 1

to 1%. Sperm DNA was collected after phenol extraction by ethanol precipitation. 

Size fractionation of human DNA 

Annn hiood soerm DNA was digested to completion with Sau3A plus Mbol In the presence of 4mM 
sperSeThlo^rusr^^^^^^ 

^r^ntrni Hinfi<?t lackina DNA was loaded into a horizontal 20cm Jong 0,6% agarose gei i&igma lypu . ; 
™"IT«r™ri.>M Na 0.,.M BUTK pH 8.3) SJCJ ~|,» « 

and DNA collected by electroelution and ^^anol preaprtation _ DN^^^ equipment was de- 

Amplification of MS32 alleles 
6.7mM 2-mercaptoethanol, 4.5µM EDTA, 1mM dATP, 1mM dCTP, 1mM dGTP, 1mM dTTP (Pharmacia),
110µg/ml bovine serum albumin (DNase free, Pharmacia) plus 1µM each of oligonucleotides C1 plus D, or
C plus D1, and 5 units Taq polymerase (Amersham). Reactions were cycled for 1.3 minutes at 96°, 1
minute at 67° and 4 minutes at 70° for 25-30 cycles on a DNA Thermal Cycler (Perkin Elmer Cetus). PCR
products were electrophoresed through a 0.8% agarose gel, and amplified alleles visualised by staining with
ethidium bromide. Alleles were recovered by electro-elution and ethanol precipitation and dissolved in 5mM
Tris-HCl (pH 7.0). Other sources of Taq polymerase were equally effective (AmpliTaq (Perkin Elmer Cetus)
and Type III Taq polymerase (Cambio)).

Human DNA samples for single molecule analysis were centrifuged to remove any particulate matter,
diluted with 5mM Tris-HCl (pH 7.0) plus 0.1µM PCR primers as carrier, and amplified in 7µl PCR reactions
using primers A plus B at 96° for 1.3 minutes, 64° for 1 minute and 70° for 4 minutes for 25-28 cycles. PCR
products were detected by Southern blot hybridization with a ³²P-labelled MS32 minisatellite probe as
described previously (Jeffreys et al., 1988). PCR products were re-amplified by seeding a 30µl PCR
reaction containing 1µM nested primers C1 plus D, or C plus D1, with 0.4µl of the initial PCR reaction and
cycling at 96° for 1.3 minutes, 65° for 1 minute and 70° for 4 minutes for 25-30 cycles. Re-amplified alleles
were purified by agarose gel electrophoresis as described above.
Table 1

Properties of human minisatellites selected for PCR amplification.

Clone    Locus    Chromosome     Heterozygosity   No.       Allelic length   %GC²   Ref
                  localisation   (%)              alleles   range (kb)

pλg3     D7S22    7q36-qter      97               >40       0.6-20           66     [8, 10, 32]
MS32     D1S8     1q42-q43       97               >40       1.1-20           62     [10, 32]
pMS51    D11S97   11q13          77               9         1.3-4.3          69     [33]
p33.1                            66               10        1.1-2.5          56     [5]
p33.4                            71               7         0.8-1.3          68     [5]
p33.6                            67               8         0.5-1.0          70     [5]
Figure 1).
² GC content of the minisatellite repeat units. The pMS51 minisatellite repeat unit is
5'-ACATGGCAGG(AGGGCAGG)nTGGAGGG-3', where n = 1 or 2 depending on the repeat unit.
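
The GC content of the pMS51 repeat unit can be checked directly from the sequence given in the footnote. The short Python sketch below expands the repeat for n = 1 and n = 2 and computes the percentage of G and C bases; the helper names are illustrative only.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)

def pms51_repeat(n: int) -> str:
    """pMS51 repeat unit ACATGGCAGG(AGGGCAGG)nTGGAGGG for a given n."""
    return "ACATGGCAGG" + "AGGGCAGG" * n + "TGGAGGG"

for n in (1, 2):
    unit = pms51_repeat(n)
    print(f"n = {n}: {len(unit)} nt, {gc_percent(unit):.1f}% GC")
```

The two forms of the repeat unit come out at 68.0% and about 69.7% GC, which appears consistent with the value of 69% listed for pMS51 in Table 1.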



Table 2

Properties of minisatellite loci detected by locus-specific clones

probe     locus     per cent          chromosomal     bp flanking DNA sequenced
                    heterozygosity    location        5'        3'

pλg3      D7S22     97.4              7q36-qter       438       309
pMS1      D1S7      99.4              1p33-p35        1284      434
pMS31     D7S21     98.0              7p22-pter       399       12
pMS32     D1S8      97.5              1q42-q43        207       429
pMS43     D12S11    95.9 (A)          12q24.3-qter    14        780
                    30 (B)                            (780)     373
pMS51     D11S97    77                11q13           113       194
pMS228              94 (A)            17p13-pter
                    85 (B)                            400       1057


Note: in pMS228 only DNA flanking 228B has been sequenced (see Figure 1).
Restriction mapping suggests that only about 500bp immediately flanking 228A
have thereby been omitted.
Table 3

Summary of deletion mutants of MS32 detected in size-fractionated sperm and blood DNA digested with
Sau3A plus MboI

Fraction    Size        No. haploid genomes   No.       No.      No.         No. repeats
            range, kb   analysed x10*"^       mutants   mapped   different   expected   observed

Sperm S1    3.0-3.5     0.16                  17        12       6           83-100     86-118
      S2    2.4-3.0     1.1                   27        17       7           62-83      59-76
      S3    1.7-2.4     2.9                   15        4        3           37-62      48-60
      S4    1.1-1.7     3.7                   5         5        3           17-37      25-34
      S5    0.6-1.1     3.4                   0                              0-17
Blood B1    3.1-3.8     0.12                  11        8        7           86-110     91-113
      B2    2.3-3.1     0.72                  15        15       13          59-86      61-86
      B3    1.6-2.3     1.0                   11        9        5           34-59      38-62
      B4    1.1-1.6     1.5                   5         5        5           17-34      20-29
      B5    0.6-1.1     1.5                   0                              0-17



Table 3 legend 



Sperm and blood DNA from an individual heterozygous for MS32 Sau3A alleles 4.9 and 5.1 kb long
(alleles 31 and 32, Figure 12) were digested with Sau3A plus MboI and size-fractionated by agarose gel
electrophoresis as described in Experimental Procedures. The yield of DNA in each fraction was estimated
by electrophoresing aliquots of each fraction against a dilution series of unfractionated digested DNA,
followed by scanning densitometry of the appropriate molecular weight region on a photograph of the
ethidium bromide stained gel. Multiple aliquots of each fraction were amplified using PCR primers A and B
and tested for new mutants as described in Figure 13. The number of haploid genomes analysed for
mutants of a given size class was derived from the yield of each fraction, assuming 3pg DNA per haploid
genome and correcting for the estimated proportion (40%) of single MS32 molecules which successfully
amplify. The observed number of repeats in each mutant molecule was estimated from the size of the initial
product amplified with primers A plus B, and confirmed by direct determination of the number of repeats in
mutants subjected to internal mapping.
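
The conversion from DNA yield to the number of haploid genomes effectively analysed, as described in this legend, can be sketched as follows. The 3 pg per haploid genome and the 40% amplification efficiency come from the legend; the function name and the example input of 30 pg are hypothetical and chosen only for illustration.

```python
PG_PER_HAPLOID_GENOME = 3.0   # assumed DNA content per haploid genome (from the legend)
AMPLIFIABLE_FRACTION = 0.40   # estimated proportion of single MS32 molecules that amplify

def effective_haploid_genomes(dna_pg: float) -> float:
    """Haploid genomes effectively screened in a given mass of DNA (picograms)."""
    return dna_pg / PG_PER_HAPLOID_GENOME * AMPLIFIABLE_FRACTION

# Hypothetical example: 30 pg of DNA holds 10 haploid genomes, of which about
# 4 would be expected to give a PCR product.
example = effective_haploid_genomes(30.0)
print(f"{example:.1f} effective haploid genomes in 30 pg")
```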



Figure 1 



Primers and hybridization probes used in the amplification of minisatellites by PCR. Each minisatellite
locus was amplified using 20- or 24-mer primers A and B located in unique sequence flanking DNA a and
b bp respectively from the minisatellite. PCR products were detected by hybridization with an internal
minisatellite probe isolated by cleavage with restriction endonucleases X and Y which cleave c and d bp
from the minisatellite. Details of the six minisatellites are as follows:

«^ A «;' ArrACAGQCAQAGTAAGAGG-3 , B = 5 -CCACCCTQCTTACAQCAATQ-S . X - Pst I. Y - uoei. 
' f'-M b - 58 r= ard = 45 R = 37: MS32. A = s'-TCACCQGTQAATTCCACAQAGACT-S .8 = 5- 

StyI, Y = DraIII, a = 14, b = 45, c = 10, d = 16, R = 37.
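
The description states that MS32 primer A is a 24-mer containing an EcoRI site at nucleotides 9-14. The Python sketch below checks that position. The primer sequence used is the reading TCACCGGTGAATTCCACAGACACT, which is an assumption reconstructed from the partly garbled copies of the sequence in this text; GAATTC is the standard EcoRI recognition sequence.

```python
# MS32 primer A as read from the text (an OCR-corrected reading, so an assumption).
PRIMER_A = "TCACCGGTGAATTCCACAGACACT"
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence

start = PRIMER_A.find(ECORI_SITE)  # 0-based index, -1 if the site is absent
if start < 0:
    raise ValueError("No EcoRI site found in primer")

# Convert to 1-based nucleotide positions, as used in the text.
first_nt = start + 1
last_nt = start + len(ECORI_SITE)
print(f"{len(PRIMER_A)}-mer, EcoRI site at nt {first_nt}-{last_nt}")
```

With this reading the site falls at nucleotides 9-14 of a 24-mer, in agreement with the description.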
Figure 2

Amplification of MS32 minisatellite alleles by PCR. A, 0.1µg aliquots of DNA from CEPH individuals
2306, 10208, 133101 and 133304, which together contain 8 different MS32 alleles ranging from 1.1 to 17.9
kb, were pooled and amplified for 10-20 cycles in 10µl reactions containing 1 unit Taq polymerase plus
flanking primers A and B. PCR products were separated by electrophoresis in a 1% agarose gel and
detected by Southern blot hybridization with a minisatellite probe. Taq polymerase extension times at 70°
were for 6, 15 or 30 minutes with (+) or without (-) addition of extra polymerase (1 unit) at the 10th cycle.
H, 2µg of each CEPH DNA digested with AluI; AluI sites flanking MS32 are located such that each AluI
allele is 0.2 kb longer than its corresponding PCR product. Autoradiography was for 5 hours (cycles 10, 14)
or 1 hour (17, 20) without an intensifier screen. B, effect of increasing concentration of Taq polymerase (a-c,
0.5, 1, 2 units respectively) on the efficiency of amplification of large alleles. The extension time at 70° was
15 minutes.



Figure 3 

Efficiency of amplification of MS32 minisatellite alleles as a function of allele length, with PCR extension
times of 6 minutes (0) or 15 minutes (0). The gain in product per amplification cycle was determined by
scanning laser densitometry of tracks H, 6+ and 15+ of Figure 2, exposed to pre-flashed X-ray film without
an intensifier screen. The mean estimates of gain per cycle determined up to cycle 10, from cycle 10 to 14
and from cycle 14 to 17 were in close agreement, indicating that the yield of PCR product is increasing
exponentially at least up to cycle 17; the Figure shows the mean value of the three estimates of the gain for
each allele.



Figure 4 

Co-amplification of two minisatellites from single cell equivalents of human DNA. A, 60 or 6pg aliquots
of DNA from blood from an individual heterozygous for alleles a, b at λMS32 and c, d at pMS51 were
amplified for 25 cycles with 15 minutes extension times in the presence of primers A and B for both loci,
followed by Southern blot hybridization analysis of amplification products. Low levels of allele a could be
detected in three of the 6pg samples on prolonged autoradiographic exposure (arrows). B, analysis of
spurious amplification products of MS32. Two 60pg aliquots of DNA were amplified for 30 cycles, followed
by digestion with S1 nuclease (S1), BglI (B) or HpaI (H). BglI cleaves once in the flanking DNA, between the
MS32 minisatellite and primer B, and removes 311 bp of flanking DNA. HpaI cleaves between primer A and
the minisatellite, removing 195bp of flanking DNA (see Figure 1 legend).



Figure 5

Co-amplification of six different human minisatellites by PCR. A, amplification of 10ng (first four lanes)
or 1ng DNA (last two lanes) from individual 1 for 15 or 18 cycles respectively, using a cocktail of primers A
and B for minisatellites pλg3, MS32, pMS51, 33.1, 33.4 and 33.6. PCR products were detected by
Southern blot hybridization with a cocktail of all six ³²P-labelled minisatellite probes. The individual tested
had been previously characterised at all six loci separately, which enabled all hybridizing DNA fragments to
be assigned as shown; this individual is heterozygous at all six loci. These DNA fingerprints are from three
separate experiments. Note that 33.4 has failed to amplify in the last track. B, DNA fingerprints of
unrelated individuals (2-9) following amplification of 1ng samples of DNA for 18 cycles. C, DNA fingerprints
of 3-generation family (CEPH kindred 1435), following amplification of 10ng DNA for 15 cycles. Three bands,
corresponding to alleles of 33.4 and pMS51, failed to amplify in individual 12, as shown by a second
analysis (first track, bands marked with an asterisk). In all experiments, PCR products were
digested with S1 nuclease (see Materials and Methods) prior to gel electrophoresis, to reduce background
labelling. DNA-free controls in all experiments were consistently blank (not shown).
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Variability of DNA fingerprints producted by co-amplifying six minisatellites simultaneously 1ng 
sampreTo DNA from 21 unrelated individuals were analysed as described in Rgure 5. A vanatjon m the 
number oVresolvable DNA fragments per individual. The mean number of bands resolved per d.fferences 
se^n belen the DNA fingerprint of pairs of unrelated individuals, based on 29 independent pa,rw^e 
comparSTns The number of discordant bands is the total number of bands not shared by the t«o 
indTvSs being compared. The theoretical maximum number of discordancies with 6 loc. ts 24. and he 
obTe^ed m^^^^^^^ 2.8 (S.D.). The distribution of discordancies approximates to a Pcsson distnbution 

with this mean (dots). 
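The discordance count defined in this legend can be illustrated with a short sketch. The band labels and function names below are hypothetical, and bands are assumed to be comparable as exact labels, which ignores the limits of gel resolution. The maximum of 24 follows if both individuals are heterozygous at all six loci and share no alleles.

```python
def discordant_bands(bands_a, bands_b):
    """Count bands present in one fingerprint but not in the other."""
    return len(set(bands_a) ^ set(bands_b))


def max_discordancies(n_loci, alleles_per_locus=2, n_individuals=2):
    """Upper bound: every allele of both individuals is resolved and unshared."""
    return n_loci * alleles_per_locus * n_individuals


# Hypothetical band labels (locus, size in kb); not data from this document.
person_x = {("MS32", 2.1), ("MS32", 3.4), ("33.6", 1.2)}
person_y = {("MS32", 2.1), ("MS32", 4.0), ("33.6", 1.5)}

print(discordant_bands(person_x, person_y))  # 4
print(max_discordancies(6))                  # 24
```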



Figure 7 

Amplification of minisatellites from single human cells. A, samples containing 0, 1 or 3 small 
lymphocytes ([...] analysed in Figure 5A) were lysed either with proteinase K plus SDS or by 
heating in water, followed by co-amplification with primers for MS32, pMS51, 33.1, 33.4 and 33.6 for 27 
cycles. PCR products were Southern blot hybridized sequentially with each of the five minisatellite probes. 
B, amplification products of single buccal cells, analysed following lysis with proteinase K plus SDS and 
PCR as above. Cells from two individuals, a and b, were tested; b is homozygous at pMS51, 33.1 and 33.6. 
0, no cell control. Spurious PCR products are indicated with arrows. 



Figure 8 

Structure of plasmid clone pMS228. Boxed regions represent minisatellite arrays [...] 
minisatellite regions discernible; 228A detects the more strongly hybridising bands and 228B the fainter 
bands on high stringency hybridisation of human DNA. 



Figure 9 

Direct detection of amplified MS32 minisatellite alleles [...] 
following 30 cycles of PCR on 0.4 µg human genomic DNA. M = marker DNA (0.5 µg [...]). [...] 
corresponding to collapsed minisatellite alleles. 0 = no DNA control. 



Figure 10 

MVR map of the smaller MS32 allele in CEPH individual 10202. A, HinfI (F) and HaeIII (H) partial digest 
cleavage sites in the allele. The locations of PCR primers A and B are shown. 



Figure 11 



[...] sequences are A, 5'-TCAGCGGTGAATTCCACAGACA[...]; C, 5'-CTTG[...]TTCTC[...]. Derivatives C1 
and D1, with a 5' extension TCACCGGTGAA[...] containing an EcoRI site (underlined), [...] end-labelling and mapping. B, 
[...] genomic DNA using primers C1 plus D (left) or C plus D1 (right), end-labelled at the EcoRI site and partially 
digested with HinfI (F) or HaeIII (H), followed by agarose gel electrophoresis and autoradiography. The 
distribution of internal minisatellite repeat units in this allele cleaved or not cleaved by HaeIII is shown in 
(A). 



Figure 12 

MVR maps of MS32 alleles sampled from various populations and aligned to show map similarities. 
Each allele is coded according to whether each repeat unit is cleaved (a, A) or not cleaved (t, T) by HaeIII. 
Each allele is given a serial number (1-32) in order of allele length together with population origin (A, Amish; 
E, English; F, French; M, Mormon; V, Venezuelan) and number of repeat units. High order tandem 
repetitions in the MVR maps are indicated below alleles by ~>. MVR map regions common to more than 
one allele were identified by dot matrix comparisons of all pairwise combinations of alleles. Gaps (-) were 
introduced to optimize alignments. Four common 5' haplotypes (1, 2A, 2B, 3) were so identified. Haplotype 
1 (uppercase) extends for 20-21 repeat units. Haplotype 2A (uppercase) is more variable in length. 
Haplotype 2B consists of the first 15 repeats of haplotype 2A (uppercase) followed by a variable number of 
additional repeat units (lowercase underlined). The short and provisionally identified haplotype 3 
(uppercase) appears to be preferentially associated with alleles containing mainly a-type repeat units. Three 
remaining alleles ("others") show no significant similarity to alleles classifiable into the 4 main 5' haplotypic 
groups. Matches between identical (3, 4) and very similar (24, 25, 26) alleles are indicated by vertical lines. 
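A dot matrix comparison of two MVR codes can be sketched as follows. The allele strings, the window length and the function name are illustrative assumptions, not the parameters used to prepare this figure; case is ignored so that only the a/t repeat type is compared.

```python
def dot_matrix(code_x, code_y, window=4):
    """Return (i, j) positions where `window` consecutive repeat units match."""
    x, y = code_x.lower(), code_y.lower()
    hits = []
    for i in range(len(x) - window + 1):
        for j in range(len(y) - window + 1):
            if x[i:i + window] == y[j:j + window]:
                hits.append((i, j))
    return hits


# Hypothetical MVR codes: a = repeat cleaved by HaeIII, t = not cleaved.
allele_1 = "aattatta"
allele_2 = "tattatat"
print(dot_matrix(allele_1, allele_2))  # [(1, 1), (2, 2), (3, 0), (4, 1)]
```

Runs of hits along a diagonal, such as (1, 1) followed by (2, 2), would indicate a map region shared by the two alleles at the same offset.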



Figure 13 

Amplification of single molecule mutant MS32 alleles from size-fractionated sperm DNA. A, fidelity and 
efficiency of single molecule PCR. 30 pg aliquots of a Sau3A plus MboI digest of sperm DNA from an 
individual heterozygous for alleles 31 and 32 (Figure 3) were amplified for 28 cycles using PCR primers A 
plus B (Figure 1A). PCR products were electrophoresed through a 0.8% agarose gel and Southern blot 
hybridized with a 32P-labelled MS32 minisatellite probe. 28 out of 40 duplicate PCR reactions showed PCR 
products from both alleles. From the Poisson distribution, this indicates a mean of 1.8 successful 
amplification events per allele per reaction, compared with a mean input of 5 molecules of each allele. Thus 
36% of input MS32 molecules give a PCR signal. B, single molecule amplification events from sperm DNA 
digested with Sau3A plus MboI and size-fractionated by gel electrophoresis. Multiple aliquots of DNA from 
fractions S1-S5, which contained respectively DNA from the equivalent of 0.03, 0.15, 0.6, 1.2 and 1.0 µg total 
sperm DNA, were amplified for 25 cycles using PCR primers A plus B. PCR products were detected by 
Southern blot hybridization with an MS32 minisatellite probe. H, 30 pg unfractionated sperm DNA. The size 
range of each fraction is shown. C, mutant minisatellites in (B) re-amplified using the nested PCR primers 
C1 plus D (Figure 11 legend) for 25-28 cycles followed by agarose gel electrophoresis and staining with 
ethidium bromide. 
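
The Poisson estimate quoted in (A) can be reproduced with a short calculation, offered here as a sketch. It assumes that the two alleles amplify independently and that a reaction shows an allele whenever at least one molecule of it amplifies; the function name is illustrative.

```python
import math


def mean_events_per_allele(n_both, n_reactions):
    """Poisson mean of successful amplifications per allele per reaction.

    P(allele detected) = 1 - exp(-m); with two independent alleles,
    P(both detected) = (1 - exp(-m)) ** 2.
    """
    p_both = n_both / n_reactions
    p_one = math.sqrt(p_both)
    return -math.log(1.0 - p_one)


m = mean_events_per_allele(28, 40)
efficiency = m / 5  # mean input of 5 molecules of each allele
print(round(m, 1), round(100 * efficiency))  # 1.8 36
```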
Claims 

1. A method of characterising a test sample of genomic DNA by reference to one or more controls, 
which method comprises amplifying the minisatellite sequence at at least one informative locus in the test 
sample [...] 
(i) hybridising the test sample with primer in respect of each informative locus to be amplified, the primer 
being hybridisable to a single strand of the test sample at a region which flanks the minisatellite sequence 
of the informative locus to be amplified under conditions such that an extension product of the primer is 
synthesised which is complementary to and spans the said minisatellite sequence of the strand of the test 
sample; 
(ii) separating the extension product so formed from the template on which it was synthesized to yield 
single stranded molecules; 
(iii) [...] the primer of step (i) with single stranded molecules obtained according to step (ii) 
under conditions such that a primer extension product is synthesised from the template of at least one of 
the single stranded molecules obtained according to step (ii); and 
(iv) detecting the amplification products and comparing them with one or more controls; 
the method being effected such that sufficient of the desired extension product is generated to be 
detectable but such that the yield of extension product is inadequate to permit substantial out-of-register 
hybridisation between complementary minisatellite template strands. 

2. A method as claimed in claim 1 wherein the test sample of genomic DNA is subjected to restriction 
prior to amplification and only one primer is used in respect of each informative locus to be amplified. 

3. A method as claimed in claim 1 which comprises 
(i) hybridising the test sample with two primers in respect of each informative locus [...] 
primer being hybridisable to single strands of the test sample at a region which flanks the minisatellite 
sequence of the informative locus to be amplified under conditions such that an extension product of each 
primer is synthesised which is complementary to and spans the said minisatellite sequence of each strand 
of the test sample whereby the extension product synthesised from one primer, when it is separated from 
its complement, can serve as a template for synthesis of the extension product of the other primer; 
(ii) separating the extension product so formed from the template on which it was synthesised to yield 
single stranded molecules; 
(iii) [...] hybridising the primers of step (i) with the single stranded molecules [...] of 
step (ii) under conditions such that a primer extension product is synthesised from the template of each of 
the single stranded molecules obtained according to step (ii); and 
(iv) detecting the amplification products and comparing them with one or more controls; 
the method being effected such that sufficient of the desired extension product is generated to be 
detectable but such that the yield of extension product is inadequate to permit substantial out-of-register 
hybridisation between complementary minisatellite template strands. 

4. [...] of an informative locus in the [...] 

5. A method as claimed in any one of the previous claims which includes the use of at least one 
enzyme which specifically digests or degrades single stranded DNA whilst leaving double stranded DNA 
intact whereby to alleviate the formation of aspecific amplification products. 

6. A method as claimed in any one of the previous claims wherein amplification is effected in buffer of 
reduced ionic strength and at an elevated annealing temperature whereby to alleviate mispriming. 

7. A method as claimed in any one of the previous claims wherein more than one informative locus is 
[...] 

8. A method as claimed in any one of the previous claims for the characterisation of one molecule of an 
[...] 

[...] DNA test sample at a region which flanks the minisatellite sequence of an informative 
[...] complementary to and spans the said minisatellite [...] 

[...] copies of a polynucleotide extension product as claimed [...] 

[...] as claimed in claim 11 above, which comprises the products of at least 3 cycles of the [...] 

[...] for amplifying the minisatellite sequence at at least one [...] 
the primer is synthesised which is complementary to [...] 
sequence of the strand of the test sample; and the kit further comprising a [...] 

[...] comprises two complementary flanking polynucleotide primers in 
respect of each informative locus. 